|Full name of the supervisor||Degree and position of the supervisor||Supervisor's email (the one they actually reply from)||Type of internship||Place of internship||Place of internship (if other)||Internship topic||Number of students you are willing to take on this topic||Internship assignment||Detailed description of the internship content||Prerequisites (if any)|
|Ratnikov Fedor Dmitrievich||Leading Researcher||firstname.lastname@example.org||Research||HSE||remotely||Determination of amplitude-time properties of a signal by machine learning methods||2||https://docs.google.com/document/d/1SyXq4dsw4FoGZUM3gMKqfBW6_AfPkmNYV1txiC8tmIc/edit?usp=sharing||https://docs.google.com/document/d/1SyXq4dsw4FoGZUM3gMKqfBW6_AfPkmNYV1txiC8tmIc/edit?usp=sharing, Problems 1 and 2||ML|
|Ratnikov Fedor Dmitrievich||Senior Researcher||email@example.com||Research||HSE||remotely||Construction of a generative model that reproduces the amplitude and temporal properties of the signal||1||https://docs.google.com/document/d/1SyXq4dsw4FoGZUM3gMKqfBW6_AfPkmNYV1txiC8tmIc/edit?usp=sharing, Problem 3||https://docs.google.com/document/d/1SyXq4dsw4FoGZUM3gMKqfBW6_AfPkmNYV1txiC8tmIc/edit?usp=sharing, Problem 3||ML, VAE, GAN|
|Ershov Egor Ivanovich||Ph.D., Junior Researcher||firstname.lastname@example.org||Software project (all years of PI and PMI)||Other||IITP RAS||Color imaging pipeline: white balance (three internships: one algorithm per student)||3||Implement one of the classic white balance algorithms||When taking a photo, a mobile phone applies color correction based on the lighting in the scene. This is done so that the colors do not look unnatural when the image is later viewed somewhere else. Usually one estimates the characteristics of the light source in the scene and replaces them with those of a "white" source. This is done in software and is called automatic white balance. During these two weeks, you will gain basic skills in working with images, understand the differences between image storage formats, and write your own automatic white balance algorithm.||Basic Python knowledge|
|Ershov Egor Ivanovich||Ph.D., Junior Researcher||email@example.com||Research||Other||IITP RAS||Three-dimensional fast Hough transform: a study of accuracy||2||Implement the construction of dyadic lines and planes. Collect data on the distribution of their errors. Try to describe the pattern.||The fast Hough transform is a fast (it cannot be done faster) algorithm for computing sums over all discrete (specific) straight lines in an image. In our laboratory, we study the properties of these discrete lines, including in the three-dimensional case. In this internship, you will learn a new algorithm, measure a number of new characteristics of this family of algorithms through computational experiments and, perhaps, improve your knowledge of MATLAB. In addition, the starred (bonus) problem is to try to describe the observed effects analytically.||MATLAB|
|Gurvich Vladimir Aleksandrovich||PhD, Senior Researcher||firstname.lastname@example.org||Training (1st and 2nd year PMI, 1st year PI)||HSE||Elements of Game Theory||5||Studying the theory and solving problems on the following topics: combinatorial games, Nash solvability.||Lectures and literature will be sent by e-mail to all participants.||Knowledge of the basic definitions of graph theory is useful but not required.|
|Yakovlev Konstantin Sergeevich||Ph.D., associate professor, leading||email@example.com||Software project (all years of PI and PMI)||HSE||Autonomous navigation of mobile agents||5||1. Study the literature on trajectory planning.|
2. Choose a problem to solve and a solution method (in coordination with the supervisor):
- trajectory planning in 2D using visibility graphs
- trajectory planning in 2D using Voronoi diagrams
- trajectory planning in 2D using regular decomposition graphs
- trajectory planning in 2D with kinematic constraints
- trajectory planning in 3D
- trajectory planning for groups of agents
- etc.
|Trajectory planning is one of the fundamental abilities of any intelligent agent (a robot, a character in a computer game, etc.), so methods for planning spatial movements have always received (and continue to receive) much attention in artificial intelligence. Despite its seeming simplicity, this is a non-trivial (challenging) task whose solution requires not only solid knowledge of mathematics and computer science, but also ingenuity.|
In the research laboratory where I work, my colleagues and I have been dealing with this interesting topic for several years: we conduct research, publish in scientific journals, and present our new results at conferences and seminars. I suggest that you also get acquainted with the area of path finding and master the basic algorithms, which can be further developed, improved and applied in real systems (navigation services, robotics, etc.).
|C/C++, English|
|Zykov Sergey Viktorovich||Doctor of Technical Sciences, Professor||firstname.lastname@example.org||Research||HSE||Overview of modern methods of project risk analysis based on decision-making technologies||2||Review of modern methods of risk analysis in project development. Analysis of the most common decision-making technologies. Recommendations on applying the considered methods to the problems studied.|
|Zykov Sergey Viktorovich||Ph.D.,||email@example.com||Research||HSE||Examining the capabilities of software products for project risk analysis||1||Overview of modern software products used for risk analysis in project development. Recommendations on how to use specific products in various cases within the subject area.|
|Aizenberg Anton Andreevich||Ph.D., associate||firstname.lastname@example.org||Research||HSE||Constructing diagrams of orbifold singularities for the misorientation spaces of polycrystalline materials.||2||The minimum task for the internship: write a program in Python, or any other language, according to my algorithm. The program will receive two discrete subgroups of the multiplicative group of quaternions as input (most likely, each subgroup will simply be given as a list of its elements, that is, a set of 4-dimensional vectors, up to 120 of them). As output, the program should produce a three-dimensional image depicting a certain set of circles, ideally with some labels. It should be possible to rotate and zoom these three-dimensional images and, in the future, to publish them in open access.||Despite the intimidating title, the minimum task for the internship is to write a program in Python, or any other language, according to my algorithm. The program will receive two discrete subgroups of the multiplicative group of quaternions as input (most likely, each subgroup will simply be given as a list of its elements, that is, a set of 4-dimensional vectors, up to 120 of them). As output, the program should produce a three-dimensional image depicting a certain set of circles, ideally with some labels. It should be possible to rotate and zoom these three-dimensional images and, in the future, to publish them in open access.|
The maximum task: additionally work through the theory and propose your own ideas. Study the concepts of quaternions, the double cover of the orthogonal group by the three-dimensional sphere of unit quaternions, crystallographic groups and their binary extensions, and the statement and essence of the Poincaré conjecture. In short, explore the basic concepts of low-dimensional topology. Also understand the concept of the misorientation space that arises in materials science.
|Basics of linear algebra, good geometric imagination. The desire to learn something abstract, such as quaternions, low-dimensional topology, or the basics of crystallography|
|Aizenberg Anton Andreevich||Ph.D., associate||email@example.com||Research||HSE||Numerical work with the permutohedron volume polynomial and other Voronoi regions||2||In the minimal version of the summer internship, you will need to test one hypothesis about the volume polynomial of a permutohedron. This takes a few steps: (1) understand what a permutohedron is, (2) understand what a volume polynomial is, (3) understand how to calculate the volume of a permutohedron. The last step follows the article by Alexander Postnikov (in English). The article contains several combinatorial formulas; you will have to work out which of them is most convenient to use. You will need to verify a certain identity for the volume polynomial of the permutohedron in those dimensions that a computer can handle. Maximum program: understand weighted Voronoi diagrams and the meaning of the identity, and conduct a number of computational experiments for weighted Voronoi diagrams.||Article: Alexander Postnikov, Permutohedra, associahedra, and beyond; you will need to read it selectively along the way.||Knowledge of linear algebra and a general understanding of what a permutation group is. The ability (or the ability to learn) to program in Python (or another language) with abstract polynomials and rational functions of several variables. The desire to learn a lot about convex polyhedra.|
|Aizenberg Anton Andreevich||Ph.D., associate||firstname.lastname@example.org||Research||HSE||Calculation of the homological characteristics of an acquaintance graph.||2||Ideally, the following should be done during the internship. (1) Understand the concepts of homology and persistent homology, at least in dimension one. One option is to work through the paper in which homological scaffolds of the correlation graph of brain regions are constructed (the paper is not specifically topological, and much of it is written fairly accessibly). (2) Get familiar with a package that computes (persistent) homology and cycles. (3) Scrape an acquaintance graph from somewhere, for example, that of FKN students. (4) Apply the algorithm to this graph and evaluate the topological significance of each edge of the acquaintance graph. (5 *) [this most likely has nothing to do with topology, but is interesting in general] Is it possible to determine with high accuracy, from the acquaintance graph of FKN students, which student is a teaching assistant?||Article: Petri, Expert, ..., Homological scaffolds of brain functional networks; you will need to work through part of it and apply a similar technique to the analysis of acquaintance graphs
|Ideally, you already know something about algebraic topology in general and homology in particular. If not, you need the desire to work through these topics. The ability to quickly get used to new concepts, algorithms and their implementation (for example, in Python)|
|Aizenberg Anton Andreevich||Ph.D., associate||email@example.com||Research||HSE||Topology and activity analysis of place cells in the mouse hippocampus||3||There are several parallel tasks here. The initial setting is the same for everyone: a mouse runs through a "labyrinth" for several minutes while a camera films a certain area of its brain in which individual neurons are visible. From the video, a table was constructed recording the activity of the i-th neuron (out of 600) at the j-th instant of time (out of several thousand). The work consists in constructing and analyzing topological objects associated with this table. (1) In previous work, 20 neurons were selected as the best candidates for "place neurons" (place cells). Take about 10 of them such that the corresponding connected place fields cover the maze, and check whether, knowing only these 10 neurons and having no information about the actual position of the mouse, it is possible to recover the remaining 10 place cells (you can try this both by data analysis and combinatorially-topologically; there is one idea for how to approach it). Try to identify place cells with disconnected place fields in a similar way. (2) Given all 20 place cells and the time series of their activity, reconstruct the trajectory of the mouse. Compare it with the real trajectory - did you guess correctly? (3) You will need to work through one of the two articles related to these tasks and present it at the seminar. This task is more educational than research-oriented, but it is only for students in their third year and above. The articles are in English and look difficult. (4) * - with some probability, by the end of July we will have new data on mice in more complex mazes. You will need to carry out the initial processing of this data, select the place cells, describe the corresponding place fields, etc.
But only if the data appears.||Articles. An introductory one on what this is all about: Curto, Itskov, Cell Groups Reveal Structure of Stimulus Space. Beforehand, it would be good to read on Wikipedia what place cells are and what Alexandrov's theorem on the nerve of a cover (the Nerve Theorem) says.|
More advanced articles on the analysis: Erik Rybakken, Nils Baas, and Benjamin Dunn, Decoding of neural data using cohomological feature extraction
Rubin, Sheintuch, ..., Revealing neural correlates of behavior without behavioral measurements
|You need a general understanding of one of three areas (statistical data processing, topology, neurobiology) and the desire to build up the other two. This task is difficult, but you will learn many new things both in theory and in practice, and you will get to work with real data.|
|Aizenberg Anton Andreevich||Ph.D., associate professor DBDiIP||firstname.lastname@example.org||Research||HSE||Software implementation of zigzag persistence and homological scaffolds in fMRI data||3||In topology, there are the classical concepts of homology groups and Betti numbers. In applied topology, which has been around for about 15 years, there is the concept of persistent homology - it allows you to analyze the topology of spaces by statistical methods. The theory of persistent homology is based on the classification of modules over a polynomial ring - a classical topic in algebra. Relatively recently, the concept of zigzag persistence has appeared, which allows exploring dynamically changing graphs and hypergraphs using methods of topology and statistics. The internship offers several tasks. (1) Theory: study (if necessary) what persistent homology is, and work through the article by Carlsson et al. on zigzag persistence. (2) Theory / practice: figure out how the computation of zigzag persistence is implemented in the Python package Dionysus, which works with persistent homology. (3) Practice: compute zigzag persistence diagrams for correlation graphs obtained from fMRI, and describe the homological scaffolds. The last task is the most difficult, because to some extent it requires understanding the essence of what is happening. But it is also the most interesting.||A book on homology and persistent homology: Oudot, Persistent homology (although it is better to start learning homology from something more basic, for example, Wikipedia).|
An article on zigzag persistence: Carlsson, de Silva, Morozov, Zigzag Persistent Homology and Real-valued Functions
An article on homological scaffolds in correlation graphs obtained from fMRI: Petri, Expert, ..., Homological scaffolds of brain functional networks
|Ideally, for item 1, knowledge of the fundamentals of homological algebra. Otherwise, good knowledge of linear algebra and no fear of large diagrams of vector spaces and linear maps between them. For items 2-3 - the ability to program well and to figure out how new Python packages work. For item 3 - the desire to work with real brain data.|
|Aizenberg Anton Andreevich||Ph.D., associate||email@example.com||Software project (all years of PI and PMI)||HSE||Numerical work with cobordism classes of toric varieties||1||Despite the abundance of incomprehensible words, the work will be clear and accessible, including to students who have completed the first year with good grades. You will need to carry out computer calculations using a set of algorithms that I have. The calculations are described below, with their meaning in brackets (that is, the mathematical motivation of the work; it would be great to understand it, but this is not required).|
1) From a set of combinatorial data (the fan of a toric variety), construct a polynomial in several variables (the volume polynomial of the toric variety).
2) Calculate a specific set of partial derivatives of the resulting polynomial (computing the characteristic numbers of the toric variety).
3) Perform these calculations for a set of initial data (products of projective spaces, that is, a basis of the complex cobordism ring, as well as permutohedral varieties in small dimensions).
4) Solve a system of linear equations that depends on the numbers found (express a permutohedral variety in terms of the basis of the cobordism ring).
5) Check whether certain numbers coincide (check Buchstaber's conjecture about the relationship between the structure of the exponential of the formal group of complex cobordism and permutohedral varieties - most likely it is not true, but it is difficult to check).
6) Provide the source code so that I can change the input data in step (3) myself.
|I expect to write a detailed memo describing all the algorithms and calculations that need to be implemented||Good knowledge of all subjects studied in the 1st year of PMI. The ability to program all sorts of abstract things, such as rational numbers, polynomials, partial derivatives.|
|Osipov Dmitry Sergeevich||Ph.D. assistant||firstname.lastname@example.org||Research||Other||IITP RAS||Cryptosystem based on error-correcting codes||2||1. Study the cryptosystem based on an error-correcting code described in the work of Esmaeli-Gulliver 2. Implement one of the variants of such a cryptosystem in software||The goal of the internship is to familiarize students with the methods of the theory of error-correcting codes and the prospects for their use in post-quantum cryptography.||English sufficient to read and understand scientific texts. Basic knowledge of linear algebra and programming in MATLAB|
|Osipov Dmitry Sergeevich||Ph.D., senior||email@example.com||Research||Other||IITP RAS||Methods of coding theory in data storage||2||1. Get acquainted with coding theory methods and their use in data storage systems 2. Implement in software the construction of a code for data storage||The goal of the internship is to acquaint students with the basic concepts of coding theory and the coding theory methods used in modern data storage systems||English sufficient to read and understand scientific texts. Basic knowledge of linear algebra and programming in MATLAB|
|Rygaev Ivan Petrovich||Researcher||firstname.lastname@example.org||Software project (all years of PI and PMI)||Other||Laboratory of Computer Linguistics IITP RAS||Backward chaining reasoning||1||Develop a backward-chaining inference algorithm.||The ETAP-3 linguistic processor includes a semantic text analysis system, currently under development, that can draw automatic conclusions from the processed text. The text is converted into a semantic (logical) representation, which is stored in a database. Inference rules (implications of the "if-then" type) are then applied to this representation and add the derived conclusions to the database. The resulting extended structure can then be searched for answers to questions that are not answered explicitly in the text.|
For example, given the sentence "Vasya bought a car" as input, the inference rules add to the database the knowledge corresponding to the sentences "Vasya got the car", "Vasya gave the money", "Vasya has a car", etc. Asking the question "Does Vasya have a car?" yields the answer "Yes", because the answer is contained in the extended structure. This is the forward inference (forward chaining) algorithm.
The backward inference algorithm should work the other way around: not from the data, but from the query. Instead of expanding the data, it expands the query so that the required answer can be obtained from the basic (non-extended) data structure. Implications are processed in the opposite direction - from "then" to "if". In our example, having received the question "Does Vasya have a car?", the system must first search for it in the base structure. If no answer is found, it should look through the implications for other events that may lead to "Vasya has a car". Such events include "Vasya got the car" and "Vasya bought the car". Thus, the query expands to: "Does Vasya have a car? Or did Vasya get the car? Or did Vasya buy the car?" Each of these subqueries is searched for separately in the base structure; the third subquery gets a positive answer, and therefore so does the original query.
The student will be asked to develop a backward-chaining inference algorithm for the ETAP system.
|Rygaev Ivan Petrovich||Researcher||email@example.com||Software project (all years of PI and PMI)||Other||Laboratory of Computer Linguistics IITP RAS||Lexical game||2||Develop a web interface for a game of guessing lexical functions||In a language, some meanings are expressed by different words depending on the word they are combined with. For example, the meaning "a high degree of something" is expressed in Russian as follows: for rain we say pouring, for a disease - severe, for fog - thick, etc. This is called a lexical function f(x) = y. In this example, the function is called MAGN: MAGN(rain) = pouring, MAGN(disease) = severe, MAGN(fog) = thick, etc. There are several dozen such functions. Other examples include noun - adjective (f(squirrel) = squirrel, f(crow) = crow), specific - general (f(butterfly) = insect, f(dove) = bird), animal - the sound it makes (f(cat) = meows, f(dog) = barks), etc.|
We plan to make a game for guessing lexical functions. The user is shown a pair (x, y) for a certain lexical function. The function itself is not named. The user is then asked to guess y for another x under the same function. For example: "the cat meows, and the dog -?". A prototype of the game can be viewed at http://jent.ru/lexy/
The student will be invited to take part in writing either the web user interface for the game or its server side. The work concerns software mechanisms and algorithms. The content (words with lexical functions) will be provided to the student.
|Makarov Sergey Lvovich||associate professor tech. of||firstname.lastname@example.org||Software project (all years of PI and PMI)||HSE||A program that implements the Veitch diagram method||1||Study the Veitch diagram method|
Search for and analyze analogues
Develop a program / application (desktop, web, mobile - it does not matter which)
Prepare an internship report
|You can just google it: https://vuzlit.ru/899465/metod_diagramm_veycha|
The program may be implemented either recursively or in the usual way, but it must be able to minimize Boolean functions of 2 to 10 arguments and output the result in human-readable form, e.g. abc + c (not) de, where (not) denotes negation, written as a bar over the letter d.
|Makarov Sergey Lvovich||associate professor tech. of||email@example.com||Software project (all years of PI and PMI)||HSE||Internet of Things smart street lamp||1||Develop an electrical circuit|
Learn to work with Raspberry Pi 3 and sensors
Learn to work with Android Things
Learn to work with any cloud, for example Firebase or Artik Cloud
Develop a program / application (web, desktop, mobile) that displays the lamp's on/off cycles
|The circuit includes a Raspberry Pi 3, a motion sensor, some kind of flashlight or lamp, and a relay. When motion is detected, the sensor sends a signal to the Raspberry Pi, which switches the relay, and the lamp lights up. Data on how long the lamp was on, when it turned on and when it turned off are stored in the cloud and accessible via the application (where you can plot the activity and calculate how much energy (watts) was spent on lighting over a period of time)||Programming|
|Makarov Sergey Lvovich||associate professor tech. of||firstname.lastname@example.org||Software project (all years of PI and PMI)||HSE||Using analytics tools in a finished application||1||Learn to use various application analytics tools (at least 3).|
Embed the tools into a finished application
Collect analytics data for 2 weeks, compare the results, and draw conclusions
Write an internship report
|There is a finished application (desktop / mobile / web). We take several analytics tools (Flurry, Crashlytics, Mixpanel, Yandex.Metrica, Google Analytics, Firebase Analytics, etc.) - at least 3 - and integrate them into the application, then analyze the results and draw conclusions: which tool is better or worse, what the differences are, what the tools allow you to do, and which parameters they analyze (retention rate, session length, number of installations, number of users, LTV, number of crashes, etc.)||Programming|
|Kreshchuk Alexey Andreevich||Ph.D., senior researcher IITP RAS||email@example.com||Software project (all years of PI and PMI)||Other||IITP RAS||Development of a library for working with finite fields in C++17||2||Finite fields are often used in cryptography and coding theory. Although there are many libraries that implement operations in finite fields, most of them are no longer developed, and the rest are not flexible enough. This project proposes to develop a library for working with finite fields of characteristic 2. Main subtasks:|
1. Develop a convenient API that makes it possible to use both static typing and type erasure.
2. Investigate various ways of representing finite fields and determine how efficient they are for fields of different orders.
3. Develop ways to store elements of finite fields that combine low memory consumption with high speed.
|Finite fields are the basis of all modern algebraic coding theory and public key cryptography. Despite the simplicity and elegance of the mathematical description, the practical implementation of operations on their elements is not so trivial. Different representations of a finite field as a chain of extensions of the trivial (here, Boolean) field lead to different multiplication algorithms and different ways of storing elements. To create a universal library, it is necessary to build a construction kit of various algorithms and give the user the opportunity to choose the representation most suitable for their case. In addition, many algorithms require precomputation, which creates additional difficulties in storing references to the precomputed results. All this makes implementing a library for working with finite fields an interesting task in terms of both design and implementation.||Requires programming skills in C++|
|Kreshchuk Alexey Andreevich||Ph.D., senior researcher IITP RAS||firstname.lastname@example.org||Software project (all years of PI and PMI)||Other||IITP RAS||Development of a library for receiving and transmitting signals with Pluto SDR in C++17||1||ADALM Pluto is a modern platform for transmitting and receiving arbitrary signals from computer to computer, intended primarily for learning. The manufacturer provides an API for C, but not for C++. This project proposes to create a convenient API that will later be used to teach students Digital Signal Processing. Main subtasks:|
1. Develop a convenient API for working with the Pluto SDR software-defined radio in C++17 (or C++2a).
2. Implement this API using the existing C API.
3. Write a test application that transmits and receives an analog signal.
|Currently, APIs for C and Python, as well as blocks for Simulink, GNU Radio, and LabView, are offered for working with Pluto SDR. As practice has shown, using Python for signal processing is not very convenient, as it is not well suited to real-time applications. The C interface, in turn, is lower level. For more convenient work with Pluto SDR, an interface is needed that combines the advantages of the C and Python APIs and uses the features of C++17 to create the most convenient abstractions.||Requires programming skills in C++|
|Posypkin Mikhail Anatolyevich||Doctor of Physics and Mathematics,||email@example.com||Research||Other||FIZ IU RAS (Vavilova, 40)||Experimental study of the effectiveness of optimization algorithms||4||Develop a methodology for the experimental comparison of various optimization algorithms and carry out such a comparison.||Comparing the effectiveness of optimization algorithms raises many difficulties. First, there is the diversity of comparison criteria: speed, robustness, accuracy of finding the minimum. Second, there is the dependence on the function being optimized, as well as on the initial approximation. The work requires developing effective comparison methods, building tooling for the comparison in the form of Python scripts and modules for visualizing results, and planning and conducting a computational experiment.|
|Pantyukhin Dmitry Valerievich||Senior Lecturer||firstname.lastname@example.org||Software project (all years of PI and PMI)||HSE||Development of an assistant program for the secretary of the coursework / final qualifying work (KR / VKR) defense committee||1||Develop a program to assist the secretary of the coursework / final qualifying work (KR / VKR) defense committee, with the following functionality:|
1) connecting to a given LMS course (read-only) under the secretary's account
2) downloading students' materials ("projects") submitted as archives (zip, rar) and saving them to disk
3) analyzing the content of each project to determine whether the set of files is complete (documentation, presentation, supervisor's review, etc.)
4) displaying on screen / saving to an xls file and printing a summary table with the results of the analysis.
The program should:
- allow the LMS course to be specified,
- keep the password protected;
- cope with poorly structured archives
- allow viewing the contents of downloaded files
- allow manual editing of table fields.
An additional requirement is to analyze the supervisor's review, provided as a scanned image, and recognize the grade given.
|The student will receive a big plus to their karma and the thanks of the committee secretaries for making their hard work easier.|
|Rygaev Ivan Petrovich||Researcheremail@example.com||Program project (all courses PI and PMI)||Other||Laboratory of Computer Linguistics IITP RAS||SynTagRus in Universal Dependencies format||2||Develop a script for converting syntactically sized SynTagRus package into Universal Dependencies format||SynTagRus is currently the largest corpus (collection of texts) in Russian with syntax markup. Not only parts of speech and morphological indicators of words (such as gender, case, time, mood, etc.), but also syntactic relations between words (subject, addition, definition, etc.) are marked in the corpus. The set of syntactic relations and morphological characteristics is based on the Igor Melchuk's Meaning-Text theory and corresponds to those used in the ETAP linguistic processor. Recently, however, the Universal Dependencies format (www.universaldependencies.org) has become the de facto standard for syntactic markup. Therefore, the problem arises of converting the SynTagRus package into this format. This task has already been solved once, however, the syntactic relations specific to the Russian language that are present in the corpus have been lost. They were replaced by universal ones, which led to the unification of several types of relationships into one. It is required to re-convert with preservation of all the Russian-specific features of SynTagRus.|
The student will be asked to write a script (or part of it) in Python or another language to convert the SynTagRus corpus into the Universal Dependencies format. A mapping of syntactic relations and morphological features will be provided.
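One possible way to keep Russian-specific relations during conversion is the `universal:subtype` notation that Universal Dependencies allows for language-specific relation subtypes. The following is a minimal sketch under stated assumptions: the `Token` record, the `RELATION_MAP` table, and the subtype labels are hypothetical illustrations, not the mapping that will be provided with the task.

```python
from dataclasses import dataclass

# Hypothetical mapping: SynTagRus relation -> (UD relation, Russian-specific subtype or None).
# Illustrative placeholders only; the real mapping is supplied with the task.
RELATION_MAP = {
    "предик": ("nsubj", None),
    "1-компл": ("obj", "first"),
    "2-компл": ("iobj", "second"),
}


@dataclass
class Token:
    idx: int        # 1-based position in the sentence
    form: str
    lemma: str
    upos: str
    head: int       # 0 marks the root
    relation: str   # SynTagRus relation name, "" for the root


def to_ud_relation(relation: str, head: int) -> str:
    """Map a SynTagRus relation to a UD label, keeping the subtype if any."""
    if head == 0:
        return "root"
    # Unknown relations fall back to the generic UD label "dep".
    universal, subtype = RELATION_MAP.get(relation, ("dep", None))
    return f"{universal}:{subtype}" if subtype else universal


def to_conllu_line(tok: Token) -> str:
    """Render one token as a 10-column CoNLL-U line (unused columns are '_')."""
    columns = [
        str(tok.idx), tok.form, tok.lemma, tok.upos, "_", "_",
        str(tok.head), to_ud_relation(tok.relation, tok.head), "_", "_",
    ]
    return "\t".join(columns)
```

Keeping the subtype after the colon means a consumer that only understands universal relations can strip it, while the Russian-specific distinction stays recoverable.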
|Rygaev Ivan Petrovich||Researcher||firstname.lastname@example.org||Program project (all courses PI and PMI)||Other||Laboratory of Computer Linguistics IITP RAS||Regression test for semETAP semantic analyzer||1||Develop a program for comparing new semantic structures with old ones and displaying the discrepancies||The SemETAP semantic analyzer translates an original Russian sentence into its semantic representation, either a semantic graph or text in a formal language, Etoalog, which reflects the meaning of the sentence. This representation is built using rules for converting the syntactic structure into the semantic one, together with rules of inference. These rules are constantly being extended and changed to support the processing of new words and new language phenomena. Unfortunately, besides their positive impact, these changes can also have negative effects, sometimes in unexpected places. To keep track of exactly what the changes affected, a regression test is needed. The test is a set of sentences, each associated with a reference semantic structure. When the test is run, each sentence undergoes semantic analysis under the new rules, producing new semantic structures. These structures are then compared with the reference ones, and the discrepancies are displayed on screen (by analogy with version control systems).|
The student will be asked to write a program that compares the text of the new semantic structures with the old ones and displays the discrepancies on screen.
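The comparison step can be sketched with Python's standard `difflib` module, which produces diffs in the style of version control systems. This is a minimal sketch, assuming the reference and new structures are available as plain text keyed by a sentence identifier; the function names and the dictionary layout are hypothetical.

```python
import difflib


def compare_structures(reference: str, new: str, sentence_id: str) -> list[str]:
    """Return unified-diff lines between the reference and new structure texts."""
    return list(
        difflib.unified_diff(
            reference.splitlines(),
            new.splitlines(),
            fromfile=f"{sentence_id} (reference)",
            tofile=f"{sentence_id} (new)",
            lineterm="",
        )
    )


def run_regression(references: dict[str, str], results: dict[str, str]) -> dict[str, list[str]]:
    """Compare every sentence; keep only the sentences whose structures differ."""
    report = {}
    for sentence_id, reference in references.items():
        # A sentence missing from the new results is treated as an empty structure.
        diff = compare_structures(reference, results.get(sentence_id, ""), sentence_id)
        if diff:
            report[sentence_id] = diff
    return report
```

Printing the lines of each entry in the report gives the on-screen display of discrepancies; an empty report means no structure changed.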
|Irina A. Lomazova and Julio C. Carrasquel||Tenured Professor and Research Assistant||email@example.com||Research||HSE||To be arranged with the student||Analysis of Trading Systems with Process Mining||1||Analysis of Trading Systems using Process Data Mining: Extracting Knowledge about Processes from Event Data||The London Stock Exchange (LSE) system is an important trading venue for buying / selling orders (company stocks). It can be used to make it very complex. We are interested in those orders.|
You will receive information from the LSE System. During this training.
|Basic skills in any programming language (e.g. Java, Python).|
|Bauwens Bruno Frederik||Assistant firstname.lastname@example.org||Research||HSE||Analysis of the game used to separate the monotone Kolmogorov complexity and algorithmic probability||2||Winning strategy in a game. The existence of a winning strategy is equivalent to an open question. We try strategies for small values. Students can choose the game mathematically.||https://www.dropbox.com/s/j5kvherrtmn69jx/monotoneCompl_vs_AprioriProb.pdf?dl=0|
|Bauwens Bruno Frederik||Assistant email@example.com||Research||HSE||Kolmogorov complexity and foundations of learning theory||3||Studying Kolmogorov complexity and its applications in inductive inference; solving various exercises from books.|
|Bauwens Bruno Frederik||Assistant firstname.lastname@example.org||Training (1, 2 year PMI, 1 year PI)||HSE||Quantum computation||2||Learn the basics of quantum computation||https://homepages.cwi.nl/~rdewolf/qcnotes.pdf|
|Parinov Andrey Andreevich||junior researcher, senior email@example.com||Program project (all courses PI and PMI)||HSE||ISSA NUMBERS||Collection and analysis of data on entering universities in Russia||5|
The task is to write programs that collect data on university applicants from the websites of Moscow universities.
|Knowledge of the Python 3 programming language.|
Shapoval Alexander Borisovich
Training (1, 2 year PMI, 1 year PI)
|HSE||Application of principal component analysis (PCA) to the analysis of the effectiveness of political elites||3|
Study principal component analysis.
Clean the original data.
Apply principal component analysis to the data.
Write a report.
Interest in mathematical analysis and political economy
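The cleaning and PCA steps listed above can be sketched as follows. This is a minimal sketch, assuming the data is a numeric NumPy array with one row per observation and missing values stored as NaN; the function names are hypothetical, and dropping incomplete rows is only one of several reasonable cleaning choices.

```python
import numpy as np


def clean(data: np.ndarray) -> np.ndarray:
    """Drop rows with missing values, then standardize each column."""
    data = data[~np.isnan(data).any(axis=1)]
    return (data - data.mean(axis=0)) / data.std(axis=0)


def pca(data: np.ndarray, n_components: int):
    """Return (scores, explained variance ratio) for the leading components."""
    # SVD of the centered data: the rows of vt are the principal directions.
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    scores = data @ vt[:n_components].T
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]
```

The explained variance ratio indicates how many components are worth keeping for the report; a column with zero variance would need to be removed before standardizing.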
|Derkach D.A.||senior firstname.lastname@example.org||Research||HSE||search for supernovae||5||One of the pressing tasks of astronomy is the classification of various types of objects by their light curves, that is, the dependence of their brightness on time. Such astronomical time series are distinctive for their strong non-uniformity and their unknown time origin. The non-uniformity lies in the fact that the time between observations of the same object varies, and is sometimes quite long. The time origin is unknown because it is difficult to tell exactly when an astronomical event began when only part of it has been observed.|
You are invited to explore the potential of machine learning methods for classifying some of the most powerful explosions in the Universe: supernovae. You will get acquainted with several algorithms and improve them.
|Stages of work:|
1) Study of the data, plotting the light curves
2) Implementation of the “naive” algorithm based on the method of extracting features used by experts, bringing the accuracy of the binary classification to 0.6.
3) Study and implementation of one of the proposed algorithms presented in the literature or in completed Kaggle competitions.
4) Improving the accuracy of binary classification to 0.9
|python, scikit-learn, pytorch|
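Stage 2 above relies on features extracted from each light curve. The following is a minimal sketch of such an extraction, assuming each light curve is given as parallel lists of observation times and brightness values. The feature set is illustrative and is not the expert feature set the task refers to.

```python
from statistics import mean


def light_curve_features(times: list[float], fluxes: list[float]) -> dict[str, float]:
    """Summarize an irregularly sampled light curve with a few scalar features."""
    # Shift times so the first observation is t = 0; the true start of the
    # event is unknown, so only relative times are meaningful.
    t0 = min(times)
    rel = [t - t0 for t in times]
    order = sorted(range(len(rel)), key=rel.__getitem__)
    rel = [rel[i] for i in order]
    flux = [fluxes[i] for i in order]

    peak = max(range(len(flux)), key=flux.__getitem__)
    gaps = [b - a for a, b in zip(rel, rel[1:])]
    return {
        "amplitude": max(flux) - min(flux),
        "mean_flux": mean(flux),
        "time_to_peak": rel[peak],
        "duration": rel[-1],
        "max_gap": max(gaps) if gaps else 0.0,
    }
```

The resulting dictionaries can be stacked into a feature matrix and passed to any binary classifier, for example one from scikit-learn.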
|Ustyuzhanin A.E.||senior email@example.com||Program project (all courses PI and PMI)||HSE||development of a bot for analyzing the operation of AI algorithms||5||This project will develop bots that simulate simple user actions for several Internet resources (google, yandex, bing, yahoo). Such a bot will make it possible to quantify the differences in the results returned depending on the history of user behavior.||Measuring the degree of distortion (community polarization) could form the basis for building a global polarization index (by countries and systems). Such a summary is very important both to the companies themselves, as feedback, and to the community, for assessing the influence of algorithms on the formation of trends and opinions.|
The first step in assessing this influence is to quantify the differences that depend on the user's behavior strategy. This project aims to collect primary statistics for such a study.
The project is conducted in collaboration with the Faculty of Sociology.
https://medium.com/@sergey_57776/bitv- gaz-earth- which- may- be- played- e3bc5cad5e63,
|python, experience with packages for parsing internet resources|
|Ustyuzhanin A.E.||senior firstname.lastname@example.org||Research||HSE||Reconstruction of particle properties from tracks in Time Projection Chamber detectors||5||Many neutrino detectors use a special type of sensitive element, the Time Projection Chamber, in which charged particles drift under the action of an electromagnetic field. The distinctive feature of such detectors is that the delay in registering a signal is related to the distance traveled by the particle. Reconstruction algorithms for such detectors work well at low particle density. In the course of the practical work, you will need to develop an algorithm that reconstructs, as fully as possible, the configuration of a high-density initial event from the detector signals.||python, scikit-learn, pytorch|
|Ustyuzhanin A.E.||senior email@example.com||Research||HSE||Recovery of 3D structure from X-ray images||5||Reconstructing the 3D structure of complex objects is a pressing task for biologists, chemists, and solid-state physicists. Modern accelerator facilities make it possible to illuminate objects of interest with radiation of specified characteristics. During imaging, the object rotates, producing a series of digital projections. From this series of tomograms, the volumetric density map of the original object must be reconstructed.|
In the course of this project, you will need to learn to determine a basic characteristic: the center of rotation of the original object.
|Data provided by the XFEL experiment. The project is conducted together with representatives of the experiment.||python, scikit-learn, pytorch|
|Ustyuzhanin A.E.||senior firstname.lastname@example.org||Research||HSE||Search for extended galactic objects||5||The project aims to develop an algorithm to search for extended galactic objects in data collected by the FERMI telescope. A weak signal, a high level of instrumental noise, and the presence of a galactic background are the key detection difficulties. The search for such sources is one of the central tasks of the FERMI collaboration and the astronomical community as a whole. Of greatest interest are regions close to the galactic center, where the density distribution of the galactic background is an open question. In the course of the project, you will need to become familiar with the current algorithms for separating the signal from the background and to develop or improve them to achieve the required level of accuracy.||Data provided by the FERMI experiment||python, scikit-learn, pytorch|
|Borisyak M.A.||junior email@example.com||Research||HSE||Development of a gradient boosting algorithm for anomaly detection||5||Anomaly detection is one of the important tasks of data analysis. It is used in protection against cyber attacks, in the detection of suspicious transactions, and in building reliable production quality control systems. Classical gradient boosting algorithms over trees work well when the classes being separated are balanced in the sample. Additional mechanisms are usually used to detect anomalies (rare events). One of them is the artificial generation of rare events based on a) the available data and b) the output of a pre-trained model. In this project, you will review how gradient boosting over trees works, become familiar with the approach developed in our laboratory for adapting neural networks to detect anomalies, and extend it to work with trees.||https://www.dropbox.com/s/kxri6cxq9yb25ga/Borisyak_Boosting_for_anomaly_detection.pdf?dl=0||python, scikit-learn, pytorch|
|Ustyuzhanin A.E.||senior firstname.lastname@example.org||Research||HSE||Reconstruction of the quantum states of modern quantum processors||5||In this project, you will learn how to efficiently estimate the state of an n-dimensional quantum system using modern machine learning methods.|
The procedure for determining the state parameters of a quantum system from measurement data is an integral part of experiments on building a quantum computer. Quantum state tomography algorithms reconstruct the complex parameters of a quantum state from measurement data. The number of parameters in this case (the dimension of the Hilbert space) grows exponentially with the dimension (the number of qubits), as 2^n. Because the parameter space describing the complete state of the entire quantum processor grows so rapidly, the quantum tomography procedure becomes extremely inefficient even for systems consisting of several dozen qubits. Modern quantum systems, such as the superconducting quantum processors from Google and IBM, consist of 72 and 50 qubits, respectively. Despite this, assessing the quality of the transformations performed by a quantum device is a necessary criterion for evaluating the device's efficiency. One of the methods proposed to reduce the parameter space is quantum learning [1, 2].
The project is being carried out jointly with the center QOT MSU.
|https://www.dropbox.com/s/abj4iorrwe5vvli/Ustyuzhanin_Quantum.pdf?dl=0||python, scikit-learn, pytorch, reinforcement learning|
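To get a feel for the 2^n growth mentioned in the description, the short sketch below computes the Hilbert space dimension and the memory a dense state vector would need. It assumes 16 bytes per complex amplitude (double precision); the function names are hypothetical.

```python
def hilbert_dimension(n_qubits: int) -> int:
    """Dimension of the state space of n qubits."""
    return 2 ** n_qubits


def state_vector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Memory for a dense state vector with one complex amplitude per basis state."""
    return hilbert_dimension(n_qubits) * bytes_per_amplitude


# Compare a small register with the 50- and 72-qubit processors from the text.
for n in (10, 50, 72):
    print(n, hilbert_dimension(n), state_vector_bytes(n))
```

For 50 qubits the state vector alone would occupy 2^54 bytes, which is why full tomography is impractical at this scale.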
|Ustyuzhanin A.E.||senior email@example.com||Research||HSE||Reconstruction of quantum processes of modern quantum systems||5||Application of Bayesian learning to the problem of estimating the Hamiltonian of a quantum system.|
Quantum simulators, in contrast to a universal quantum computer, are designed to solve highly specialized problems that can be reduced to the controlled evolution of a system with a given Hamiltonian. In practice, only quantum systems of a specific nature are available for experiment, for example superconducting qubits, cold atoms, or single photons. The system being simulated can be either a real complex structure, for example a molecule, or an artificial structure that evolves under a given Hamiltonian. The experimenter's task is to build the necessary structure from the available quantum systems. To make sure that the resulting system works correctly, it is necessary to estimate the parameters x of the evolution Hamiltonian H(x). This can be done with the Hamiltonian learning algorithm, which estimates the parameters of an unknown Hamiltonian [3, 4]. The search for the Hamiltonian is performed using Bayesian inference.
The project is being carried out jointly with the center QOT MSU.
|https://www.dropbox.com/s/abj4iorrwe5vvli/Ustyuzhanin_Quantum.pdf?dl=0||python, scikit-learn, pytorch, bayesian methods|
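The idea of estimating a Hamiltonian parameter by Bayesian inference can be illustrated on a toy model. The sketch below is not the algorithm from the cited works. It assumes a single-qubit Hamiltonian H(x) = (x / 2) σ_z, a qubit prepared in the |+> state, and a measurement in the X basis after evolution time t, which gives outcome 0 with probability cos^2(x t / 2). The grid, the evolution times, and all function names are hypothetical choices.

```python
import math
import random


def likelihood(outcome: int, x: float, t: float) -> float:
    """P(outcome | x, t) for the assumed toy model: P(0) = cos^2(x * t / 2)."""
    p0 = math.cos(x * t / 2) ** 2
    return p0 if outcome == 0 else 1.0 - p0


def bayes_update(posterior: list[float], grid: list[float], outcome: int, t: float) -> list[float]:
    """Multiply the current posterior by the likelihood and renormalize."""
    weights = [p * likelihood(outcome, x, t) for p, x in zip(posterior, grid)]
    total = sum(weights)
    return [w / total for w in weights]


def estimate(true_x: float, n_shots: int = 600, seed: int = 0) -> float:
    """Simulate measurements and return the posterior-mean estimate of x."""
    rng = random.Random(seed)
    grid = [i / 200 for i in range(201)]          # candidate values of x in [0, 1]
    posterior = [1 / len(grid)] * len(grid)       # flat prior
    for shot in range(n_shots):
        t = (1.0, 2.0, 3.0)[shot % 3]             # evolution times, kept below pi
        outcome = 0 if rng.random() < likelihood(0, true_x, t) else 1
        posterior = bayes_update(posterior, grid, outcome, t)
    return sum(p * x for p, x in zip(posterior, grid))
```

Keeping x t below pi avoids the periodic ambiguity of the likelihood on this grid. A realistic setting would involve several parameters, where a grid becomes too expensive and sampling-based posteriors are the usual alternative.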
|Ustyuzhanin A.E.||senior firstname.lastname@example.org||Research||HSE||Measurement of the mass of galaxy clusters||5||The project aims to develop an algorithm for analyzing astronomical data describing the location and motion of galaxies. When the number of galaxies is large, this problem is solved by classical methods. For small clusters, the accuracy of classical algorithms drops sharply, so additional dependencies must be analyzed and finer algorithms are needed. Participants are invited to familiarize themselves with state-of-the-art approaches and compare modern clustering algorithms (dimension reduction) for solving this problem.|
The project is being conducted jointly with researchers from the SAI MSU, the University of Strasbourg, and Harvard.
|Unfortunately, these articles are very technical (a bunch of formulas), because those who write them love formulas, not because they are needed there.|
In any case, here are the references (the links go to NASA ADS; click on Full Refereed Journal Article PS / PDF there; all of them are available for free):
1. http://adsabs.harvard.edu/abs/2005ApJ...628L..97D - here is the most concise description of what it is, in section 2, just half a page. It describes the calculation of the density threshold, on which everything rests: smoothing with an adaptive kernel, etc.
2. http://adsabs.harvard.edu/abs/1999MNRAS.309..610D - here is a complete description of the same. Lots of text.
3. http://adsabs.harvard.edu/abs/2006AJ....132.1275R - here is an example of applying it to SDSS data (we plan to do something similar on a much larger sample)
4. http://adsabs.harvard.edu/abs/1987MNRAS.227....1K - this is the fundamental theoretical article. You can read just the abstract; the article itself has a great many formulas.
5. http://adsabs.harvard.edu/abs/2017ApJ...843..128R - a recent article on simulations, where Fig. 1 shows how a galaxy moves on this diagram if the 3D distance is known (which we do not know from observations).