Internship Offers at FCS
1
Supervisor's full name | Supervisor's degree and position | Supervisor's email (the address they reply from) | Type of internship | Place of internship | Place of internship (if other) | Internship topic | Number of students the supervisor is willing to take on this topic | Internship assignment | Detailed description of the internship | Prerequisites (if any)
2
Ratnikov Fedor Dmitrievich | Leading Researcher | fratnikov@hse.ru | Research | HSE remotely | Determination of amplitude-time properties of a signal by Machine Learning methods | 2 | https://docs.google.com/document/d/1SyXq4dsw4FoGZUM3gMKqfBW6_AfPkmNYV1txiC8tmIc/edit?usp=sharing | https://docs.google.com/document/d/1SyXq4dsw4FoGZUM3gMKqfBW6_AfPkmNYV1txiC8tmIc/edit?usp=sharing, Problems 1 and 2 | ML
3
Ratnikov Fedor Dmitrievich | Senior Researcher | fratnikov@hse.ru | Research | HSE remotely | Construction of a generative model that reproduces the amplitude and temporal properties of the signal | 1 | https://docs.google.com/document/d/1SyXq4dsw4FoGZUM3gMKqfBW6_AfPkmNYV1txiC8tmIc/edit?usp=sharing, Problem 3 | https://docs.google.com/document/d/1SyXq4dsw4FoGZUM3gMKqfBW6_AfPkmNYV1txiC8tmIc/edit?usp=sharing, Problem 3 | ML, VAE, GAN
4
Ershov Egor Ivanovich | Ph.D., Junior Researcher | e.i.ershov@gmail.com | Program project (all courses PI and PMI) | Other | IITP RAS | Color imaging pipeline: white balance (three internships: one algorithm - one student) | 3 | Implement one of the classic white balance algorithms | When taking a photo, a mobile phone performs color correction based on the characteristics of the lighting in the scene. This is done so that the colors do not look unnatural when the image is viewed somewhere else. Usually, the algorithm tries to estimate the characteristics of the light source in the scene and replace them with the characteristics of a "white" source. This is done in software and is called automatic white balance. During these two weeks, you will gain basic skills in working with images, understand the differences between image storage formats, and write your own automatic white balance algorithm. | Basic Python knowledge
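As an illustration of what a "classic" white balance algorithm can look like, here is a minimal sketch of the Gray World method. The listing does not say which algorithms the supervisor has in mind, so the choice of Gray World, the function name `gray_world`, and the tiny example image are all assumptions. The method assumes that the average color of a scene is gray and scales each channel so that the channel means become equal.

```python
def gray_world(pixels):
    """Gray World white balance on a list of (r, g, b) tuples with values in 0..255."""
    n = len(pixels)
    # Per-channel means of the input image.
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    # Target level: the average of the three channel means.
    gray = sum(means) / 3.0
    # Per-channel gain that moves each channel mean to the common gray level.
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [
        tuple(min(255, round(p[c] * gains[c])) for c in range(3))
        for p in pixels
    ]


# A tiny hypothetical "image" with a warm (reddish) color cast.
image = [(200, 100, 50), (100, 50, 25), (240, 120, 60)]
balanced = gray_world(image)
print(balanced)
```

A real implementation would read pixel data from an image file and would have to deal with clipping and with the color space in which the image is stored.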
5
Ershov Egor Ivanovich | Ph.D., Junior Researcher | e.i.ershov@gmail.com | Research | Other | IITP RAS | Three-dimensional fast Hough transform: a study of accuracy | 2 | Implement the construction of dyadic straight lines and planes. Obtain data on the distribution of their errors. Try to describe the pattern. | The fast Hough transform is a fast (the fastest possible) algorithm for calculating sums over all discrete (specific) straight lines in an image. In our laboratory, we study the properties of these discrete straight lines, including in the three-dimensional case. In this internship, you will learn a new algorithm, determine a number of new characteristics of this family of algorithms with the help of computational experiments and, perhaps, improve your knowledge of MATLAB. In addition, the starred (optional, harder) problem is an attempt to describe the observed effects analytically. | MATLAB
6
Gurvich Vladimir Aleksandrovich | PhD, Senior Researcher | vladimir.gurvich@gmail.com | Training (1st and 2nd year PMI, 1st year PI) | HSE | Elements of Game Theory | 5 | Studying the theory and solving problems on the topics: Combinatorial Games, Nash solvability. | Lectures and literature will be sent by e-mail to all participants. | Knowledge of the basic definitions of graph theory is useful, but not necessary.
7
Yakovlev Konstantin Sergeevich | Ph.D., associate professor, leading researcher | kyakovlev@hse.ru | Program project (all courses PI and PMI) | HSE | Autonomous navigation of mobile agents | 5 | 1. Study of sources on trajectory planning.
2. Choosing a problem to solve and a solution method (in coordination with the supervisor):
- trajectory planning in 2D using visibility graphs
- trajectory planning in 2D using Voronoi diagrams
- trajectory planning in 2D using graphs of regular decomposition
- trajectory planning in 2D with kinematic constraints
- trajectory planning in 3D
- trajectory planning for groups of agents
- etc.


Trajectory planning is one of the fundamental abilities of any intelligent agent (robot, character in a computer game, etc.), so methods for planning spatial movement have always received (and continue to receive) much attention in artificial intelligence. Despite its apparent simplicity, this is a non-trivial (challenging) task whose solution requires not only solid knowledge of mathematics and computer science, but also ingenuity.

In the research laboratory where I work, my colleagues and I have been dealing with this interesting topic for several years: we conduct research, publish in scientific journals, and present our new results at conferences and seminars. I suggest that you also get acquainted with the area of path finding and master the basic algorithms, which can be further developed, improved and applied in real systems (navigation services, robotics, etc.).
C/C++, English
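To give a feel for the "graphs of regular decomposition" option above, here is a minimal sketch of A* search on a 4-connected grid. It is written in Python for brevity, although the listing asks for C/C++. The grid encoding (0 = free, 1 = obstacle), the function name `astar`, and the example map are assumptions made for illustration, not part of the internship materials.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid; grid[r][c] == 1 marks an obstacle."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f, g, cell)
    parent = {start: None}
    best_g = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry, a cheaper route was found already
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    parent[(nr, nc)] = cell
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None  # the goal is unreachable


# Hypothetical map: the only way down is through the gap in row 1.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
path = astar(grid, (0, 0), (3, 0))
print(path)
```

The other options in the list (visibility graphs, Voronoi diagrams, kinematic constraints) change how the search graph is built, while the search itself can stay much the same.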
8
Zykov Sergey Viktorovich | Doctor of Technical Sciences, Professor | szykov@hse.ru | Research | HSE | Overview of modern methods of project risk analysis based on decision-making technologies | 2 | Review of modern methods of risk analysis in project development. Analysis of the most common decision-making technologies. Recommendations on the application of the considered methods to the studied problems.
9
Zykov Sergey Viktorovich | Ph.D., professor | szykov@hse.ru | Research | HSE | Examining the capabilities of software products for project risk analysis | 1 | Overview of modern software products used for risk analysis in project development. Development of recommendations on the use of specific products in various cases, depending on the subject area.
10
Aizenberg Anton Andreevich | Ph.D., associate professor | ayzenberga@gmail.com | Research | HSE | Constructing diagrams of orbifold features for the misorientation spaces in polycrystalline materials. | 2 | The minimum task for the internship: write a program in Python, or any other language, according to my algorithm. The program will receive two discrete subgroups of the multiplicative group of quaternions as input (most likely, each subgroup will simply be given as a list of its elements, that is, a set of 4-dimensional vectors, up to 120 of them). As output, the program should produce a three-dimensional image depicting a certain set of circles, ideally with some labels. It should be possible to rotate and zoom these three-dimensional images and, in the future, to make them publicly available. | Despite the scary name, the minimum task for the internship is to write a program in Python, or any other language, according to my algorithm. The program will receive two discrete subgroups of the multiplicative group of quaternions as input (most likely, each subgroup will simply be given as a list of its elements, that is, a set of 4-dimensional vectors, up to 120 of them). As output, the program should produce a three-dimensional image depicting a certain set of circles, ideally with some labels. It should be possible to rotate and zoom these three-dimensional images and, in the future, to make them publicly available.

The maximum task: additionally, work through the theory and propose your own ideas. Study the concepts of quaternions, the double covering of the orthogonal group by the three-dimensional sphere of unit quaternions, crystallographic groups and their binary extensions, and the statement and essence of the Poincaré conjecture. In short, explore the basic concepts of low-dimensional topology, and also understand the concept of the misorientation space that arises in materials science.
Basics of linear algebra, good geometric imagination. A desire to learn something abstract, such as quaternions, low-dimensional topology, or the basics of crystallography.
11
Aizenberg Anton Andreevich | Ph.D., associate professor | ayzenberga@gmail.com | Research | HSE | Numerical work with the permutohedron volume polynomial and other Voronoi regions | 2 | In the minimal version of the summer internship, you will need to test one hypothesis about the volume polynomial of a permutohedron. To do this, you need to take a few steps: (1) understand what a permutohedron is, (2) understand what a volume polynomial is, (3) understand how to calculate the volume of a permutohedron. The latter is done following the article by Alex Postnikov (in English). The article contains several combinatorial formulas; you will have to work out which of them is most convenient to use. You will need to verify a certain identity for the volume polynomial of the permutohedron in those dimensions that the computer can handle. Maximum program: understand weighted Voronoi diagrams and the meaning of the identity, and conduct a number of computational experiments for weighted Voronoi diagrams. | Article: Alexander Postnikov, Permutohedra, associahedra, and beyond; you will need to read it selectively along the way. | Knowledge of linear algebra and a general understanding of what a permutation group is. The ability (or the ability to learn) to program in Python (or another language) to work with abstract polynomials and rational functions of several variables. A desire to learn a lot about convex polyhedra.
12
Aizenberg Anton Andreevich | Ph.D., associate professor | ayzenberga@gmail.com | Research | HSE | Calculation of the homology characteristics of an acquaintance graph. | 2 | Ideally, the following should be done during the internship. (1) Understand the concepts of homology and persistent homology, at least in dimension one. You can work through the paper in which homological scaffolds of the correlation graph of brain regions are built (this paper is not written for topologists, and many things in it are explained relatively clearly). (2) Get familiar with a package that computes (persistent) homology and cycles. (3) Scrape an acquaintance graph from somewhere, for example, of FCS students. (4) Apply the algorithm to this graph and evaluate the topological significance of each edge of the acquaintance graph. (5*) [this most likely has nothing to do with topology, but is interesting in general] Is it possible to tell with high accuracy, from the acquaintance graph of FCS students, which students are teaching assistants? | Article: Petri, Expert, ..., Homological scaffolds of brain functional networks; you will need to partially work through it and apply a similar technique to the analysis of acquaintance graphs.
Ideally, you already know something about algebraic topology in general and homology in particular. If not, a desire to figure these things out. The ability to quickly get used to new concepts, algorithms and their implementation (for example, in Python).
13
Aizenberg Anton Andreevich | Ph.D., associate professor | ayzenberga@gmail.com | Research | HSE | Topology and activity analysis of place neurons in the mouse hippocampus | 3 | There are several parallel tasks here. The initial setting is the same for everyone: a mouse runs through a "labyrinth" for several minutes while a camera films a certain area of its brain in which individual neurons are visible. From the video, a table was constructed that records the activity of the i-th neuron (out of 600) at the j-th instant of time (out of several thousand). The work consists in constructing and analyzing topological objects associated with this table. (1) In previous work, 20 neurons were selected as the best candidates for "place neurons". Take about 10 of them such that the corresponding connected place fields cover the maze, and see whether, knowing only these 10 neurons and having no information about the actual position of the mouse, it is possible to recover the remaining 10 place neurons (you can try this both as data analysis and combinatorially-topologically; there is one idea for how to approach this). Try to find place neurons with disconnected place fields in a similar way. (2) Given all 20 place neurons and the time series of their activity, reconstruct the trajectory of the mouse. Compare it with the real trajectory: did you guess correctly? (3) Work through one of the two articles related to these tasks and present it at the seminar. This task is more educational than research-oriented, but it is only for third-year students and above. The articles are in English and look difficult. (4*) With some probability, by the end of July we will have new data on mice in more complex mazes. It will be necessary to carry out the initial processing of this data, select the place neurons, describe the corresponding place fields, etc.
But only if the data appear. | Articles: an introductory one, on what this is all about: Curto, Itskov, Cell Groups Reveal Structure of Stimulus Space. Beforehand, it would be good to read on Wikipedia what place neurons are, and what Alexandrov's theorem on the nerve of a covering (Nerve Theorem) says.

More cool articles on analysis: Erik Rybakken, Nils Baas, and Benjamin Dunn, Decoding of neural data using cohomological feature extraction
and
Rubin, Sheintuch, ..., Revealing neural correlates of behavior without behavioral measurements
You need a general understanding of at least one of three areas (statistical data processing, topology, neurobiology) and a desire to build up the other two. This task is difficult, but you will learn many new things both in theory and in practice, and you will be able to work with real data.
14
Aizenberg Anton Andreevich | Ph.D., associate professor, DBDiIP | ayzenberga@gmail.com | Research | HSE | Software implementation of zigzag persistence and homological scaffolds in fMRI data | 3 | In topology, there are the classical concepts of homology groups and Betti numbers. In applied topology, which has been around for about 15 years, there is the concept of persistent homology, which allows you to analyze the topology of spaces by statistical methods. The theory of persistent homology is based on the classification of modules over a polynomial ring, a classic topic in higher algebra. Relatively recently, the concept of zigzag persistence appeared, which makes it possible to study dynamically changing graphs and hypergraphs using methods of topology and statistics. The internship offers several tasks. (1) Theory: study (if necessary) what persistent homology is, and work through Carlsson's article on zigzag persistence. (2) Theory / practice: figure out how the computation of zigzag persistence is implemented in the Python package Dionysus, which works with persistent homology. (3) Practice: compute zigzag persistence diagrams for correlation graphs obtained from fMRI, and describe the homological scaffolds. The last task is the most difficult, because it involves understanding the essence of what is happening. But it is also the most interesting. | A book on homology and persistent homology: Oudot, Persistent homology (although it is better to start homology with something more basic, for example, Wikipedia).

An article on zigzag persistence: Carlsson, de Silva, Morozov, Zigzag Persistent Homology and Real-valued Functions

An article on homological scaffolds in correlation graphs obtained from fMRI: Petri, Expert, ..., Homological scaffolds of brain functional networks
Ideally, for item 1, knowledge of the fundamentals of homological algebra. Otherwise, good knowledge of linear algebra and no fear of large diagrams of vector spaces and linear maps between them. For items 2-3, the ability to program well and to figure out how new Python packages work. For item 3, a desire to work with real brain data.
15
Aizenberg Anton Andreevich | Ph.D., associate professor | ayzenberga@gmail.com | Program project (all courses PI and PMI) | HSE | Numerical work with cobordism classes of toric varieties | 1 | Despite the abundance of incomprehensible words, the work will be clear and accessible, including to students who have completed the first year with good grades. You will need to carry out computer calculations using a set of algorithms that I have. The calculations are described below, with their meaning in brackets (that is, the mathematical motivation for the work; learning to understand it would be great, but is not required).
1) According to a set of combinatorial data (a fan of a toric variety), construct a polynomial in several variables (a polynomial of the volume of a toric variety).
2) Calculate a specific set of partial derivatives of the resulting polynomial (counting the characteristic numbers of a toric variety).
3) Perform these calculations for a set of initial data (products of projective spaces, that is, the basis of the complex cobordism ring, as well as for permutohedral manifolds in small dimensions).
4) Solve a system of linear equations depending on the numbers found (express a permutohedral manifold in terms of the cobordism rings).
5) Check whether certain numbers coincide (check Buchstaber's conjecture about the relationship between the structure of the exponential of the formal group of complex cobordism and permutohedral varieties; most likely it is not true, but this is difficult to check).
6) Provide the source code so that I can change the input data in step (3) myself.
I plan to write a detailed memo describing all the algorithms and calculations that need to be implemented. | Good knowledge of all subjects studied in the 1st year of PMI. The ability to program all sorts of abstract things, like rational numbers, polynomials, partial derivatives.
16
Osipov Dmitry Sergeevich | Ph.D., assistant professor | d_osipov@iitp.ru | Research | Other | IITP RAS | Error Correcting Cryptosystem | 2 | 1. Study a cryptosystem based on an error-correcting code, described in the work of Esmaeli-Gulliver. 2. Implement in software one of the variants of such a cryptosystem. | The goal of the internship is to familiarize students with the methods of the theory of error-correcting codes and the prospects for their use in post-quantum cryptography. | Knowledge of English sufficient to read and understand scientific texts. Basic knowledge of linear algebra and programming in MATLAB.
17
Osipov Dmitry Sergeevich | Ph.D., senior researcher | d_osipov@iitp.ru | Research | Other | IITP RAS | Methods of coding theory in data storage | 2 | 1. Get acquainted with coding theory methods and their use in data storage systems. 2. Implement in software a code construction for data storage. | The goal of the internship is to acquaint students with the basic concepts of coding theory and the coding theory methods used in modern data storage systems. | Knowledge of English sufficient to read and understand scientific texts. Basic knowledge of linear algebra and programming in MATLAB.
18
Rygaev Ivan Petrovich | Researcher | irygaev@gmail.com | Program project (all courses PI and PMI) | Other | Laboratory of Computer Linguistics IITP RAS | Backward chaining reasoning | 1 | Develop a backward inference algorithm. | In the linguistic processor ETAP-3, a system of semantic text analysis is being developed that makes it possible to draw automatic conclusions from the processed text. The text is converted into a semantic (logical) representation, which is stored in a database. Then inference rules (implications of the "if-then" type) are applied to this representation, adding the derived conclusions to the database. In this extended structure, you can then search for answers to questions that have no explicit answer in the text.
For example, having received the sentence "Vasya bought a car" as input, the inference rules supplement the database with knowledge corresponding to the sentences "Vasya got the car", "Vasya gave the money", "Vasya has a car", etc. Asking the question "Does Vasya have a car?", we get the answer "Yes", because the answer is contained in the extended structure. This is the forward inference algorithm.
The backward inference algorithm should work the other way around: not from the data, but from the query. Instead of expanding the data, it expands the query so that the necessary answer can be obtained from the basic (non-extended) data structure. Implications are processed in the opposite direction, from "then" to "if". In our example, having received the question "Does Vasya have a car?", the system must first search for it in the base structure. Having found no answer, it should look through the implications for other events that may lead to "Vasya has a car". Such events include "Vasya got the car" and "Vasya bought the car". Thus, the query expands: "Does Vasya have a car? Or did Vasya get the car? Or did Vasya buy the car?" Each of these subqueries is looked up separately in the base structure, and we get a positive answer to the third subquery, and therefore to the original query as well.
The student will be asked to develop a backward inference algorithm for the ETAP system.
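The Vasya example can be sketched as a toy backward-chaining routine. This is only an illustration under strong assumptions: facts are plain strings, every rule has a single premise, and the names `backward_chain`, `facts` and `rules` are hypothetical. It says nothing about how ETAP actually stores semantic structures or rules.

```python
def backward_chain(query, facts, rules, _seen=None):
    """Return True if `query` follows from `facts` via single-premise rules.

    `rules` is a list of (premise, conclusion) pairs, read as
    "if premise then conclusion". The search goes from "then" to "if".
    """
    seen = _seen if _seen is not None else set()
    if query in facts:
        return True  # the answer is already in the base (non-extended) data
    if query in seen:
        return False  # already tried this subquery; avoids looping on cycles
    seen.add(query)
    # Expand the query: which premises could lead to it?
    for premise, conclusion in rules:
        if conclusion == query and backward_chain(premise, facts, rules, seen):
            return True
    return False


# Base data: only what the text says explicitly.
facts = {"Vasya bought a car"}
# Implications of the "if-then" type (hypothetical rule set).
rules = [
    ("Vasya bought a car", "Vasya got the car"),
    ("Vasya bought a car", "Vasya gave the money"),
    ("Vasya got the car", "Vasya has a car"),
]
print(backward_chain("Vasya has a car", facts, rules))   # expands to "got", then "bought"
print(backward_chain("Vasya has a bike", facts, rules))  # no rule leads to this query
```

Note that the base data is never extended here; only the query grows, which is the difference from forward inference described above.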
19
Rygaev Ivan Petrovich | Researcher | irygaev@gmail.com | Program project (all courses PI and PMI) | Other | Laboratory of Computer Linguistics IITP RAS | Lexical game | 2 | Develop a web interface for a game of guessing lexical functions | In a language, some meanings are expressed by different words depending on the word they are attached to. For example, the meaning "a high degree of something" is expressed in Russian as follows: for rain we say pouring, for a disease - severe, for fog - thick, etc. This is called a lexical function f(x) = y. In this example, the function is called MAGN: MAGN(rain) = pouring, MAGN(disease) = severe, MAGN(fog) = thick, etc. There are several dozen such functions. Other examples include noun - adjective (f(squirrel) = squirrel, f(crow) = crow), specific - general (f(butterfly) = insect, f(dove) = bird), animal - the sound it makes (f(cat) = meows, f(dog) = barks), etc.
We plan to make a game for guessing lexical functions. In it, the user is shown a pair (x, y) for a certain lexical function. The function itself is not named. The user is asked to guess y for another x, with the same function in mind. For example: "the cat meows, and the dog -?". A prototype of the game can be viewed at http://jent.ru/lexy/
The student will be invited to take part in writing either the web user interface for the game or its server side. The work concerns software mechanisms and algorithms. The content (words with lexical functions) will be provided to the student.
20
Makarov Sergey Lvovich | associate professor tech. of sciencesmakarov@hse.ru | Program project (all courses PI and PMI) | HSE | A program that implements the method of Veitch diagrams | 1 | Study the method of Veitch diagrams
Search for and analyze analogues
Develop a program / application (desktop, web, mobile - it does not matter)
Prepare an internship report
You can just google it: https://vuzlit.ru/899465/metod_diagramm_veycha
The program may be implemented either recursively or iteratively, but it must be able to minimize Boolean functions of 2 to 10 arguments and present the result in human-readable form, e.g., abc + c (not) de, where (not) denotes a negation written as a bar over the letter d.
Programming
21
Makarov Sergey Lvovich | associate professor tech. of sciencesmakarov@hse.ru | Program project (all courses PI and PMI) | HSE | Internet of Things smart street lamp | 1 | Develop an electrical circuit
Learn to work with Raspberry Pi 3 and sensors
Learn to work with Android Things
Learn to work with any cloud, for example Firebase or Artik Cloud
Develop a program / application (web, desktop, mobile) that displays the lamp's on/off cycles
The circuit includes a Raspberry Pi 3, a motion sensor, some kind of flashlight or lamp, and a relay. When motion is detected, the sensor sends a signal to the Raspberry Pi, which switches the relay, and the lamp lights up. Data on how long the lamp was on, when it turned on and when it turned off is stored in the cloud and is accessible via the application (where you can plot the activity and calculate how much energy (watt-hours) was spent on lighting over a period of time). | Programming
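The energy calculation mentioned in the description can be sketched in a few lines. This assumes that each cycle is stored as a pair of on/off timestamps and that the lamp draws a constant, known power. The function name `energy_wh`, the 40 W rating and the sample log are made-up example values, not part of the assignment.

```python
from datetime import datetime


def energy_wh(cycles, lamp_power_w):
    """Energy in watt-hours used over a list of (turned_on, turned_off) pairs."""
    # Total time the lamp was on, converted from seconds to hours.
    total_hours = sum((off - on).total_seconds() for on, off in cycles) / 3600.0
    # Energy = power * time, assuming constant power while the lamp is on.
    return lamp_power_w * total_hours


# Hypothetical log as it might be read back from the cloud.
cycles = [
    (datetime(2019, 6, 1, 21, 0), datetime(2019, 6, 1, 21, 30)),   # 30 minutes
    (datetime(2019, 6, 1, 23, 0), datetime(2019, 6, 1, 23, 15)),   # 15 minutes
]
used = energy_wh(cycles, lamp_power_w=40)
print(used)
```

The same list of cycles is what the application would plot as lamp activity over time.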
22
Makarov Sergey Lvovich | associate professor tech. of sciencesmakarov@hse.ru | Program project (all courses PI and PMI) | HSE | Using analytics tools in the finished application | 1 | Learn to use various tools for application analytics (minimum 3).
Embed the tools into a finished application
Collect analytics data for 2 weeks, compare the results, and draw conclusions
Write an internship report
There is a finished application (desktop / mobile / web). We take several analytics tools (Flurry, Crashlytics, Mixpanel, Yandex.Metrica, Google Analytics, Firebase Analytics, etc.), at least 3, embed them in the application, and then analyze the results and draw conclusions: which tool is better or worse, what the differences are, what the tools allow you to do, and which parameters you analyze (retention rate, session length, number of installations, number of users, LTV, number of crashes, etc.) | Programming
23
Kreshchuk Alexey Andreevich | Ph.D., senior researcher, IITP RAS | krsch@iitp.ru | Program project (all courses PI and PMI) | Other | IITP RAS | Development of a library for working with finite fields in C++17 | 2 | Finite fields are often used in cryptography and coding theory. Although there are many libraries that implement operations in finite fields, most of them are no longer developed, and the rest are not flexible enough. This project proposes developing a library for working with finite fields of characteristic 2. Main subtasks:
1. Develop a convenient API that makes it possible to use both static typing and type erasure.
2. Investigate various ways of representing finite fields and determine how efficient they are for fields of different orders.
3. Develop ways to store elements of finite fields that combine low memory consumption and high speed.
Finite fields are the basis of all modern algebraic coding theory and public key cryptography. Despite the simplicity and elegance of the mathematical description, the practical implementation of operations on their elements is not so trivial. Different representations of a finite field as a chain of extensions of the prime (here, Boolean) field lead to different multiplication algorithms and different ways of storing elements. To create a universal library, it is necessary to build a construction kit of various algorithms and let users choose the representation most suitable for their case. In addition, many algorithms require precomputation, which creates additional difficulties in storing references to the precomputed results. All this makes implementing a library for finite fields an interesting task from the point of view of both design and implementation. | Requires programming skills in C++
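To make "operations in a finite field of characteristic 2" concrete, here is a minimal sketch of one possible representation: elements of GF(2^8) stored as integers in a polynomial basis. It is written in Python for brevity, although the project itself targets C++17. The reduction polynomial 0x11B (x^8 + x^4 + x^3 + x + 1, the one used in AES) and the function names are assumptions chosen for illustration. The project would compare several representations, and this is just the simplest one.

```python
def gf2m_mul(a, b, modulus=0x11B, m=8):
    """Multiply two elements of GF(2^m) stored as integers (polynomial basis)."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1          # multiply a by x
        if a >> m:       # degree reached m: reduce modulo the field polynomial
            a ^= modulus
    return result


def gf2m_inv(a, modulus=0x11B, m=8):
    """Inverse of a nonzero element, computed as a^(2^m - 2) by square-and-multiply."""
    result, base, exponent = 1, a, (1 << m) - 2
    while exponent:
        if exponent & 1:
            result = gf2m_mul(result, base, modulus, m)
        base = gf2m_mul(base, base, modulus, m)
        exponent >>= 1
    return result


product = gf2m_mul(0x57, 0x83)
print(hex(product))
print(hex(gf2m_mul(0x57, gf2m_inv(0x57))))
```

Bit-by-bit multiplication like this needs no precomputed tables, which keeps memory use minimal but is slow. Table-based representations trade memory for speed, which is the kind of trade-off subtasks 2 and 3 ask about.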
24
Kreshchuk Alexey Andreevich | Ph.D., senior researcher, IITP RAS | krsch@iitp.ru | Program project (all courses PI and PMI) | Other | IITP RAS | Development of a library for receiving and transmitting signals in Pluto SDR in C++17 | 1 | ADALM Pluto is a modern platform for transmitting and receiving arbitrary signals from computer to computer, intended primarily for learning. The manufacturer provides an API for C, but not for C++. This project proposes creating a convenient API that will later be used to teach students Digital Signal Processing. Main subtasks:
1. Develop a convenient API for working with the Pluto SDR software-defined radio in C++17 (or C++2a).
2. Implement this API using the existing C API.
3. Write a test application that transmits and receives an analog signal.
Currently, APIs for C and Python, as well as blocks for Simulink, GNU Radio, and LabView, are offered for working with Pluto SDR. As practice has shown, using Python for signal processing is not very convenient, as it is not well suited to real-time applications. The C interface, in turn, is lower-level. For more convenient work with Pluto SDR, you need to develop an interface that combines the advantages of the C and Python APIs and uses all the features of C++17 to create the most convenient abstractions. | Requires programming skills in C++
25
Posypkin Mikhail Anatolyevich | Doctor of Physics and Mathematics, professor | mposypkin@hse.ru | Research | Other | FIZ IU RAS (Vavilova, 40) | Experimental study of the effectiveness of optimization algorithms | 4 | Develop a methodology for experimental comparison of various optimization algorithms and carry out such a comparison. | Comparing the effectiveness of optimization algorithms raises many difficulties. First, there is a variety of comparison criteria: speed, robustness, accuracy of finding the minimum. Second, the results depend on the function being optimized, as well as on the initial approximation. The work requires developing effective comparison methods, building tooling for the comparison in the form of Python scripts and modules for visualizing results, and planning and conducting a computational experiment.
26
Pantyukhin Dmitry Valerievich | Senior Lecturer | dapntiukhin@hse.ru | Program project (all courses PI and PMI) | HSE | Development of an assistant program for the secretary of the coursework / final qualifying work (thesis) defense commission | 1 | Develop a program to assist the secretary of the coursework / thesis defense commission, with the following functionality:
1) connecting to a given LMS course (read-only) under the secretary's account
2) downloading students' materials ("projects") submitted as archives (zip, rar) and saving them to disk
3) analyzing the contents of each project to determine whether all required files are present (documentation, presentation, supervisor's review, etc.)
4) displaying on screen / saving to an xls file and printing a summary table with the results of the analysis.
The program should:
- allow the LMS course to be specified,
- keep the password protected;
- work with poorly structured archives
- allow viewing the contents of downloaded files
- allow manual editing of table fields.
An additional requirement is analysis of the supervisor's review, provided as a scanned image, with recognition of the assigned grade.
The student will earn a big plus in karma and the gratitude of the commission secretaries for easing their hard work.
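Step 3 (checking that a project archive is complete) could start from something like the sketch below. It handles only zip archives using the standard library (rar would need a third-party tool), and the keyword table `REQUIRED` and the function name `check_archive` are assumptions. Real submissions are "poorly structured", so real matching rules would have to be far more tolerant.

```python
import io
import zipfile

# Hypothetical keywords used to recognize each required file by its name.
REQUIRED = {
    "documentation": ("doc", "report"),
    "presentation": ("presentation", "slides"),
    "review": ("review",),
}


def check_archive(zip_bytes):
    """Return {kind: found} for every required kind of file in a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        # Ignore directory entries and compare names case-insensitively.
        names = [n.lower() for n in zf.namelist() if not n.endswith("/")]
    return {
        kind: any(keyword in name for name in names for keyword in keywords)
        for kind, keywords in REQUIRED.items()
    }


# Build a small in-memory archive to stand in for a downloaded project.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("Project/Report.pdf", b"")
    zf.writestr("Project/slides.pptx", b"")
result = check_archive(buffer.getvalue())
print(result)
```

Each such dictionary corresponds to one row of the summary table from step 4, with one column per required file.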
27
Rygaev Ivan Petrovich | Researcher | irygaev@gmail.com | Program project (all courses PI and PMI) | Other | Laboratory of Computer Linguistics IITP RAS | SynTagRus in Universal Dependencies format | 2 | Develop a script for converting the syntactically annotated SynTagRus corpus into the Universal Dependencies format | SynTagRus is currently the largest corpus (collection of texts) in Russian with syntactic annotation. The corpus marks not only parts of speech and morphological features of words (such as gender, case, tense, mood, etc.), but also syntactic relations between words (subject, object, modifier, etc.). The set of syntactic relations and morphological features is based on Igor Melchuk's Meaning-Text theory and corresponds to those used in the ETAP linguistic processor. Recently, however, the Universal Dependencies format (www.universaldependencies.org) has become the de facto standard for syntactic annotation. Hence the task of converting the SynTagRus corpus into this format. This task has already been solved once, but the syntactic relations specific to Russian that are present in the corpus were lost. They were replaced by universal ones, which merged several types of relations into one. The conversion needs to be redone while preserving all the Russian-specific features of SynTagRus.
The student will be asked to write a script (or part of it) in Python or another language to convert the SynTagRus corpus into the Universal Dependencies format. A mapping of syntactic relations and morphological features will be provided.
28
Rygaev Ivan Petrovich
Researcher
irygaev@gmail.com
Program project (all courses PI and PMI)
Other
Laboratory of Computer Linguistics IITP RAS
Regression test for the SemETAP semantic analyzer
1
Develop a program that compares new semantic structures with old ones and displays the discrepancies
The SemETAP semantic analyzer translates a Russian sentence into its semantic representation, either a semantic graph or a text in the formal language Etoalog, which reflects the meaning of the sentence. This representation is built using rules that convert the syntactic structure into a semantic one, together with rules of inference. These rules are constantly being extended and changed to support new words and new linguistic phenomena. Unfortunately, besides their positive impact, these changes can also have negative effects, often in unexpected places. To keep track of exactly what a change has affected, a regression test is needed. The test is a set of sentences, each associated with a reference semantic structure. When the test is run, each sentence is analyzed under the new rules, producing new semantic structures. These structures are then compared with the reference ones, and the discrepancies are displayed on the screen (by analogy with version control systems).
The student will be asked to write a program that compares the text of the new semantic structures with the old ones and displays the discrepancies on the screen.
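The comparison step can be sketched with Python's standard difflib module, which produces diffs in the style of version control systems. This is only an illustration: the function names, the dictionary of test cases, and the analyze callable are assumptions, and the real analyzer interface may differ.

```python
import difflib


def compare_structures(reference, new, name="sentence"):
    """Return unified-diff lines between two textual semantic structures."""
    diff = difflib.unified_diff(
        reference.splitlines(),
        new.splitlines(),
        fromfile=f"{name} (reference)",
        tofile=f"{name} (new)",
        lineterm="",
    )
    return list(diff)


def run_regression(cases, analyze):
    """Analyze every sentence and collect diffs for those that changed.

    cases   -- dict mapping a sentence to its reference structure (text)
    analyze -- callable returning the new structure for a sentence
               (a stand-in for the real analyzer)
    """
    failures = {}
    for sentence, reference in cases.items():
        diff = compare_structures(reference, analyze(sentence), sentence)
        if diff:  # an empty diff means the structure is unchanged
            failures[sentence] = diff
    return failures
```

Printing the lines stored in failures gives the on-screen display of discrepancies; sentences whose structures did not change are simply absent from the result.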
29
Irina A. Lomazova and Julio C. Carrasquel
Tenured Professor and the Research Assistant
jcarrasquel@hse.ru
Research
HSE
To be arranged with the student
Analysis of Trading Systems with Process Mining
1
Analysis of Trading Systems using Process Data Mining: Extracting Knowledge about Processes from Event Data
The London Stock Exchange (LSE) system is an important trading venue for buying / selling orders (company stocks). It can be used to make it very complex. We are interested in those orders.

You will receive information from the LSE System. During this training.
Basic skills in any programming language (e.g., Java, Python).
30
Bauwens Bruno Frederik
Assistant professor
brbauwens@gmail.com
Research
HSE
Analysis of the game used to separate the monotone Kolmogorov complexity and algorithmic probability
2
Winning strategy in a game. The existence of a winning strategy is equivalent to an open question. We try strategies for small values. Students can choose the game mathematically.
https://www.dropbox.com/s/j5kvherrtmn69jx/monotoneCompl_vs_AprioriProb.pdf?dl=0
31
Bauwens Bruno Frederik
Assistant professor
brbauwens@gmail.com
Research
HSE
Kolmogorov complexity and foundations of learning theory
3
Studying Kolmogorov complexity and its applications in inductive inference, and solving various exercises from books.
32
Bauwens Bruno Frederik
Assistant professor
brbauwens@gmail.com
Training (1, 2 year PMI, 1 year PI)
HSE
Quantum computation
2
Learn the basics of quantum computation
https://homepages.cwi.nl/~rdewolf/qcnotes.pdf
33
Parinov Andrey Andreevich
junior researcher, senior teacher
aparinov@hse.ru
Program project (all courses PI and PMI)
HSE
ISSA NUMBERS
Collection and analysis of data on university admissions in Russia
5
The task is to write programs that collect data on applicants from the websites of Moscow universities.
Knowledge of the Python 3 programming language.
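A minimal sketch of the parsing half of such a collector, assuming (hypothetically) that a university publishes its applicant list as a plain HTML table with a header row. Real sites will differ in layout, and the column names below are invented; only the standard library is used.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_applicants(html):
    """Turn an HTML table into a list of dicts keyed by the header row."""
    parser = TableRows()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]
```

The page itself could be downloaded with urllib.request.urlopen from the standard library before being passed to parse_applicants.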
34
Shapoval Alexander Borisovich
professor
abshapoval@gmail.com
Training (1, 2 year PMI, 1 year PI)
HSE
Application of principal component analysis (PCA) to the analysis of the effectiveness of political elites
3
Study principal component analysis.
Clean the original data.
Apply principal component analysis to the data.
Write a report.
Interest in mathematical analysis and political economy
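For orientation, the core of principal component analysis can be sketched in plain Python: center the data, build the covariance matrix, and find its leading eigenvector by power iteration. This is an illustrative sketch, not the method prescribed for the practice. In real work a library routine would normally be used, and power iteration as written assumes the covariance matrix is non-zero and the starting vector is not orthogonal to the leading component.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, variance) of the first principal component.

    rows -- list of equally long numeric sequences (one per observation)
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly apply the matrix and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance explained along v is the quadratic form v^T C v.
    variance = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, variance
```

For points lying exactly on the line y = 2x, the returned direction is proportional to (1, 2), and the variance along it equals the total variance of the data.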
35
Derkach D.A.
senior researcher
dderkach@hse.ru
Research
HSE
Search for supernovae
5
One of the pressing tasks in astronomy is classifying objects of various types by their light curves, that is, by how their brightness depends on time. Such astronomical time series are highly irregular and have an unknown time origin. They are irregular because the intervals between observations of the same object vary and are sometimes very long. The time origin is unknown because, when only part of an astronomical event is observed, it is hard to tell exactly when the event began.
 
 You are invited to explore how machine learning methods can classify some of the most powerful explosions in the Universe: supernovae. You will get acquainted with several algorithms and improve them.
Stages of work:
 1) Study of the data, plotting the light curves
 2) Implementation of a "naive" algorithm based on the feature extraction used by experts, bringing the accuracy of the binary classification to 0.6.
 3) Study and implementation of one of the algorithms presented in the literature or in completed Kaggle competitions.
 4) Improving the accuracy of the binary classification to 0.9
python, scikit-learn, pytorch
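As a starting point for the "naive" stage, hand-crafted features can be computed directly from an irregularly sampled light curve. The sketch below is only an illustration: the feature set is invented here, not the experts' actual list. Every feature is built from time differences, so it does not depend on the unknown time origin.

```python
def light_curve_features(times, fluxes):
    """Compute simple features of one light curve.

    times  -- observation times in increasing order (any units)
    fluxes -- brightness measured at those times
    """
    peak_index = max(range(len(fluxes)), key=fluxes.__getitem__)
    peak_time, peak_flux = times[peak_index], fluxes[peak_index]
    # Gaps between consecutive observations describe the irregular sampling.
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "peak_flux": peak_flux,
        "amplitude": peak_flux - min(fluxes),
        "observed_before_peak": peak_time - times[0],
        "observed_after_peak": times[-1] - peak_time,
        "max_gap": max(gaps) if gaps else 0,
    }
```

A table of such feature dictionaries, one per object, could then be passed to any standard classifier, for example one from scikit-learn.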
36
Ustyuzhanin A.E.
senior researcher
austyuzhanin@hse.ru
Program project (all courses PI and PMI)
HSE
Development of a bot for analyzing the operation of AI algorithms
5
This project will develop bots that simulate simple user actions on several Internet resources (Google, Yandex, Bing, Yahoo). Such a bot will make it possible to quantify how the returned materials differ depending on the history of user behavior.
Measuring the degree of distortion (community polarization) could form the basis for building a global polarization index (by countries and systems). Such a summary is very important both to the companies themselves, as feedback, and to the community, for assessing how algorithms influence the formation of trends and opinions.
 
 The first step in work on impact assessment is to quantify differences depending on the strategy of user behavior. This project aims to collect primary statistics for such a study.
 
 The project is conducted in collaboration with the Faculty of Sociology.
 
 Related Links:
 
 https://medium.com/@sergey_57776/bitv- gaz-earth- which- may- be- played- e3bc5cad5e63,
 http://mlg.eng.cam.ac.uk/adrian/
python, experience with web-scraping packages
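One simple way to put a number on "differences in the output" is to compare the result lists returned to two simulated users. The sketch below uses Jaccard similarity and top-k overlap. Both are generic measures chosen here for illustration, not the metric fixed by the project, and the result identifiers are placeholders.

```python
def jaccard_similarity(results_a, results_b):
    """Share of results common to both lists, ignoring order."""
    a, b = set(results_a), set(results_b)
    if not a and not b:
        return 1.0  # two empty outputs are treated as identical
    return len(a & b) / len(a | b)


def overlap_at_k(results_a, results_b, k):
    """Fraction of the top-k positions filled by results both users saw."""
    return len(set(results_a[:k]) & set(results_b[:k])) / k


# Placeholder result identifiers for two simulated browsing histories.
user_a = ["u1", "u2", "u3", "u4"]
user_b = ["u2", "u3", "u5", "u6"]
similarity = jaccard_similarity(user_a, user_b)  # 2 shared out of 6 distinct
top2 = overlap_at_k(user_a, user_b, 2)           # only "u2" is shared
```

Values near 1 mean the two users see almost the same output; values near 0 mean the output depends strongly on behavior history.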
37
Ustyuzhanin A.E.
senior researcher
austyuzhanin@hse.ru
Research
HSE
Reconstruction of particle properties from tracks in Time Projection Chamber detectors
5
Many neutrino detectors use a special type of sensitive element, the Time Projection Chamber, in which charged particles drift under the action of an electromagnetic field. What is specific to such detectors is that the delay in registering a signal is related to the distance traveled by the particle. Reconstruction algorithms for such detectors work well when the particle density is low. In this practical work, the task is to develop an algorithm that reconstructs, as fully as possible, the configuration of a high-density initial event from the detector signals.
python, scikit-learn, pytorch
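The relation between signal delay and distance can be illustrated under the simplest assumption, a constant drift velocity, in which case the drift distance is the velocity multiplied by the drift time. The function name, the units, and the numbers in the example are assumptions made for this sketch; real detectors need calibration and corrections.

```python
def drift_distance_mm(arrival_time_us, start_time_us, drift_velocity_mm_per_us):
    """Distance drifted before registration, assuming constant drift velocity.

    arrival_time_us          -- time at which the signal was registered
    start_time_us            -- time of the original event
    drift_velocity_mm_per_us -- assumed constant drift velocity
    """
    drift_time = arrival_time_us - start_time_us
    if drift_time < 0:
        raise ValueError("signal registered before the event started")
    return drift_velocity_mm_per_us * drift_time


# Illustrative numbers only: 10 microseconds of drift at 1.5 mm per microsecond.
example_distance = drift_distance_mm(12.0, 2.0, 1.5)
```

At high particle density many such signals overlap in time, which is what makes the reconstruction hard.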
38
Ustyuzhanin A.E.
senior researcher
austyuzhanin@hse.ru
Research
HSE
Reconstruction of 3D structure from X-ray images
5
Reconstructing the 3D structure of complex objects is a pressing task for biologists, chemists, and solid-state physicists. Modern accelerator facilities make it possible to expose objects of interest to radiation with specified characteristics. During imaging, the object rotates, producing a series of digital projections. From this series of tomograms, the volumetric density map of the original object must be reconstructed.
 In this project, the task is to learn to determine a basic characteristic: the center of rotation of the original object.
Data provided by the XFEL experiment. The project is conducted together with representatives of the experiment.
python, scikit-learn, pytorch
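One common textbook approach to finding the center of rotation, shown here only as a sketch and not as the method used in the project, relies on parallel-beam geometry: the projection at 180 degrees is then the mirror image of the projection at 0 degrees about the rotation axis. Mirroring the second projection and finding the shift that best aligns it with the first gives the axis position. The one-dimensional projections and the function name are assumptions of this example.

```python
def find_rotation_center(proj_0, proj_180, max_shift=None):
    """Estimate the rotation-axis position (in pixels) from two projections.

    proj_0, proj_180 -- 1D projections taken 180 degrees apart
                        (parallel-beam geometry is assumed)
    """
    n = len(proj_0)
    if max_shift is None:
        max_shift = n // 2
    mirrored = proj_180[::-1]
    best_shift, best_error = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        # Pixels x for which both mirrored[x] and proj_0[x + shift] exist.
        lo, hi = max(0, -shift), min(n, n - shift)
        if hi - lo < 2:
            continue
        error = sum(
            (proj_0[x + shift] - mirrored[x]) ** 2 for x in range(lo, hi)
        ) / (hi - lo)
        if error < best_error:
            best_shift, best_error = shift, error
    # mirrored[x] == proj_0[x + shift] implies center = (n - 1 + shift) / 2.
    return (n - 1 + best_shift) / 2
```

For a detector of n pixels, a result of (n - 1) / 2 means the axis passes through the middle of the detector. Real data would additionally need noise handling and sub-pixel refinement.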
39
Ustyuzhanin A.E.
senior researcher
austyuzhanin@hse.ru
Research
HSE
Search for extended galactic objects
5
The project aims to develop an algorithm to search for extended galactic objects in data collected by the FERMI telescope. A weak signal, a high level of instrumental noise, and the presence of a galactic background are the key detection difficulties. The search for such sources is one of the central tasks of the FERMI collaboration and the astronomical community as a whole. Of greatest interest are regions close to the galactic center, where the density distribution of the galactic background is an open question. In the course of the project, you will become familiar with the current algorithms for separating the signal from the background and develop or improve them to reach the required level of accuracy.
Data provided by the FERMI experiment
python, scikit-learn, pytorch
40
Borisyak M.A.
junior researcher
mborisyak@hse.ru
Research
HSE
Development of a gradient boosting algorithm for anomaly detection
5
Anomaly detection is one of the important tasks of data analysis. It is used in protection against cyber attacks, in the detection of suspicious transactions, and in building reliable production quality control systems. Classical gradient boosting algorithms over trees work well when the sample of classes to be separated is balanced. Detecting anomalies (rare events) usually requires additional mechanisms. One of them may be the artificial generation of rare events based on a) the available data and b) the output of a pre-trained model. In this project, you will review how gradient boosting over trees works, become familiar with the approach developed in our laboratory for adapting neural networks to anomaly detection, and extend it to trees.
https://www.dropbox.com/s/kxri6cxq9yb25ga/Borisyak_Boosting_for_anomaly_detection.pdf?dl=0
python, scikit-learn, pytorch
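To make the idea of artificially generated rare events concrete, the sketch below uses the simplest data-based variant: synthetic "anomalies" are drawn uniformly inside the bounding box of the observed data and labeled as the rare class. This is a deliberately naive stand-in, not the laboratory's approach, and it does not cover generation guided by a pre-trained model. All names are illustrative.

```python
import random


def add_synthetic_anomalies(normal_points, count, rng=None):
    """Augment normal data with uniformly sampled pseudo-anomalies.

    Returns (features, labels), where label 0 marks observed points
    and label 1 marks the generated ones.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    dims = range(len(normal_points[0]))
    lows = [min(p[j] for p in normal_points) for j in dims]
    highs = [max(p[j] for p in normal_points) for j in dims]
    synthetic = [
        [rng.uniform(lows[j], highs[j]) for j in dims] for _ in range(count)
    ]
    features = [list(p) for p in normal_points] + synthetic
    labels = [0] * len(normal_points) + [1] * count
    return features, labels
```

The augmented set could then be given to an ordinary binary classifier, such as gradient boosting over trees, which learns to separate observed points from the uniform background.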
41
Ustyuzhanin A.E.
senior researcher
austyuzhanin@hse.ru
Research
HSE
Reconstruction of the quantum states of modern quantum processors
5
In this project, you will learn how to efficiently estimate the state of an n-dimensional quantum system using modern machine learning methods.
 Determining the state parameters of a quantum system from measurement data is an integral part of experiments aimed at building a quantum computer. Quantum state tomography algorithms reconstruct the complex parameters of a quantum state from measurement data. The number of parameters in this case (the dimension of the Hilbert space) grows exponentially with the dimension (the number of qubits), as 2^n. Because the parameter space describing the complete state of the entire quantum processor grows so quickly, quantum tomography becomes extremely inefficient even for systems of several dozen qubits. Modern quantum systems, such as the superconducting quantum processors of Google and IBM, consist of 72 and 50 qubits, respectively. Nevertheless, assessing the quality of the transformations performed by a quantum device is a necessary criterion for evaluating the device's efficiency. One of the methods proposed to reduce the parameter space is quantum learning [1, 2].
 The project is being carried out jointly with the center QOT MSU.
https://www.dropbox.com/s/abj4iorrwe5vvli/Ustyuzhanin_Quantum.pdf?dl=0
python, scikit-learn, pytorch, reinforcement learning
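The exponential growth is easy to make tangible. For n qubits the Hilbert space has dimension 2^n, and a general (mixed) state is a density matrix with 4^n - 1 independent real parameters. The helper names below are illustrative.

```python
def hilbert_dimension(n_qubits):
    """Dimension of the state space of n qubits."""
    return 2 ** n_qubits


def density_matrix_parameters(n_qubits):
    """Independent real parameters of a general n-qubit density matrix."""
    d = hilbert_dimension(n_qubits)
    return d * d - 1  # Hermitian d x d matrix with unit trace


for n in (1, 2, 10, 50):
    print(n, hilbert_dimension(n), density_matrix_parameters(n))
```

Already at 50 qubits the dimension exceeds 10^15, which is why full tomography is impractical and methods that reduce the parameter space are of interest.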
42
Ustyuzhanin A.E.
senior researcher
austyuzhanin@hse.ru
Research
HSE
Reconstruction of quantum processes of modern quantum systems
5
Application of Bayesian learning to the problem of estimating the Hamiltonian of a quantum system.
 Quantum simulators, in contrast to a universal quantum computer, are designed to solve highly specialized problems that can be reduced to the controlled evolution of a system with a given Hamiltonian. In practice, only quantum systems of a specific nature are available for experiment, for example, superconducting qubits, cold atoms, or single photons. In this case, the system itself can be either a real complex structure, for example, a molecule, or an artificial structure that evolves under a given Hamiltonian. The experimenter's task is to build the required structure out of the available quantum systems. To make sure that the resulting system works correctly, the parameters x of the evolution Hamiltonian H(x) must be estimated. This can be done with the Hamiltonian learning algorithm, which estimates the parameters of an unknown Hamiltonian [3, 4]. The search for the Hamiltonian is performed using Bayesian inference.
 The project is being carried out jointly with the center QOT MSU.
https://www.dropbox.com/s/abj4iorrwe5vvli/Ustyuzhanin_Quantum.pdf?dl=0
python, scikit-learn, pytorch, bayesian methods
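A toy illustration of Bayesian inference for a Hamiltonian parameter, under assumptions chosen here for simplicity rather than taken from the project. A single qubit has H(omega) = (omega / 2) * sigma_z, it is prepared in the |+> state, evolved for time t, and measured in the X basis, so outcome 0 occurs with probability cos^2(omega * t / 2). The posterior over omega is kept on a fixed grid and updated with Bayes' rule after every measurement.

```python
import math


def likelihood(outcome, omega, t):
    """Probability of a measurement outcome (0 or 1) in the toy model."""
    p0 = math.cos(omega * t / 2) ** 2
    return p0 if outcome == 0 else 1.0 - p0


def bayes_update(grid, prior, outcome, t):
    """One application of Bayes' rule on a grid of candidate omegas."""
    unnormalized = [p * likelihood(outcome, w, t) for w, p in zip(grid, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def estimate_omega(grid, data):
    """Start from a uniform prior and process (outcome, time) pairs."""
    posterior = [1.0 / len(grid)] * len(grid)
    for outcome, t in data:
        posterior = bayes_update(grid, posterior, outcome, t)
    mean = sum(w * p for w, p in zip(grid, posterior))
    return mean, posterior
```

In this sketch the measurement times are fixed in advance. Practical Hamiltonian learning schemes usually also choose each next experiment adaptively and use more scalable posterior representations than a grid.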
43
Ustyuzhanin A.E.
senior researcher
austyuzhanin@hse.ru
Research
HSE
Measurement of the mass of galaxy clusters
5
The project aims to develop an algorithm for analyzing astronomical data describing the positions and motions of galaxies. For a large number of galaxies, this problem is solved by classical methods. For small clusters, the accuracy of classical algorithms drops sharply, so additional dependencies must be analyzed and more refined algorithms are needed. Participants are invited to familiarize themselves with state-of-the-art approaches and compare modern clustering (dimension reduction) algorithms for solving this problem.
 
 The project is being conducted jointly with researchers from the SAI MSU, the University of Strasbourg, and Harvard.
Unfortunately, these articles are very technical (full of formulas), because their authors love formulas, not because the formulas are needed there.
 
 In any case (links to NASA ADS follow; there, click on Full Refereed Journal Article PS / PDF; all of them are available for free):
 1. http://adsabs.harvard.edu/abs/2005ApJ...628L..97D - the most concise description of the method, in section 2. Just half a page. It describes the calculation of the density threshold, on which everything depends: smoothing with an adaptive kernel, etc.
 2. http://adsabs.harvard.edu/abs/1999MNRAS.309..610D - a complete description of the same. A lot of text.
 3. http://adsabs.harvard.edu/abs/2006AJ....132.1275R - an example of application to SDSS data (we plan to do something similar on a much larger sample)
 4. http://adsabs.harvard.edu/abs/1987MNRAS.227....1K - the fundamental theoretical article. You can read only the abstract; the article itself contains a great many formulas.
 5. http://adsabs.harvard.edu/abs/2017ApJ...843..128R - a recent article on simulations, where Fig. 1 shows how a galaxy moves on this diagram if the 3D distance is known (which we do not know from observations).
python, scikit-learn