Topics for Projects
Giuseppe Attardi
Human Language Technologies
Dipartimento di Informatica
Università di Pisa
Project Types
Question Generation
QA with RLHF
Legal Judgement Predictor
Challenges
Key Point Analysis
New NLP Task: Given an input corpus, consisting of a collection of relatively short, opinionated texts focused on a topic of interest, the goal of KPA is to produce a succinct list of the most prominent key-points in the input corpus, along with their relative prevalence. Thus, the output of KPA is a bullet-like summary, with an important quantitative angle and an associated well-defined evaluation framework.
Applications: to gain better insights from public opinions as expressed in social media, surveys, parliamentary debates, etc.
Challenge: https://github.com/ibm/KPA_2021_shared_task
Tracks:
KPA Example
Key point | Matched arguments count |
Mainstream schools are essential to develop social skills. | 61 |
Parents are not qualified as teachers. | 20 |
Homeschools cannot be regulated/standardized. | 15 |
Mainstream schools are of higher educational quality. | 9 |
Counts obtained by human labeling of the pro-stance arguments on the topic "Homeschooling should be banned", matched against key points provided by an expert.
KPA Example
Argument | Matching key point |
children can not learn to interact with their peers when taught at home | Mainstream schools are essential to develop social skills |
homeschooling a child denies them valuable lifeskills, particularly interaction with their own age group and all experiences stemming from this. | |
to homeschool is in one way giving a child an immersive educational experience, but not giving them the social skills and cooperative skills they need throughout life, so should be banned. | |
parents are usually not qualified to provide a suitable curriculum for their children. additionally, children are not exposed to the real world. | Parents are not qualified as teachers |
it is impossible to ensure that homeschooled children are being taught properly | Homeschools cannot be regulated/standardized. |
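The prevalence column of a KPA summary can be derived by counting how many arguments match each key point. The following is a minimal sketch, not the shared task's official tooling: the argument identifiers are hypothetical, and the blank cells in the table above are assumed to continue the key point of the preceding row.

```python
from collections import Counter

# (argument id, matched key point) pairs.
# The ids are hypothetical; rows a2 and a3 assume that the blank cells in the
# table continue the key point of the row above them.
matches = [
    ("a1", "Mainstream schools are essential to develop social skills"),
    ("a2", "Mainstream schools are essential to develop social skills"),
    ("a3", "Mainstream schools are essential to develop social skills"),
    ("a4", "Parents are not qualified as teachers"),
    ("a5", "Homeschools cannot be regulated/standardized."),
]

def key_point_prevalence(pairs):
    """Return (key point, matched argument count) sorted by decreasing count."""
    counts = Counter(key_point for _, key_point in pairs)
    return counts.most_common()

prevalence = key_point_prevalence(matches)
for key_point, count in prevalence:
    print(f"{key_point} | {count} |")
```

Sorting by count reproduces the bullet-like summary layout, with the most prominent key point first.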
Track 1 - Key-Point Matching
Input:
Arguments and expert key points for each topic and stance in the test dataset. The input consists of three files:
Output:
For each argument, its match score for each of the key points under the same topic and in the same stance towards the topic.
{"arg_15_0": {"kp_15_0": 0.8282181024551392, "kp_15_2": 0.9438725709915161},
 "arg_15_1": {"kp_15_0": 0.9994438290596008, "kp_15_2": 0}}
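A nested mapping like the one above can be assembled from per-pair match scores and serialized with the standard `json` module. This is a sketch under assumptions: the function name `build_predictions` and the flat `(argument id, key point id, score)` input layout are illustrative, not part of the shared task's code; the scores are the ones shown in the example.

```python
import json

def build_predictions(scored_pairs):
    """Group (arg_id, kp_id, score) triples into {arg_id: {kp_id: score}}."""
    predictions = {}
    for arg_id, kp_id, score in scored_pairs:
        # Create the inner dict on first sight of an argument, then store the score.
        predictions.setdefault(arg_id, {})[kp_id] = float(score)
    return predictions

# Scores copied from the example output above.
scored_pairs = [
    ("arg_15_0", "kp_15_0", 0.8282181024551392),
    ("arg_15_0", "kp_15_2", 0.9438725709915161),
    ("arg_15_1", "kp_15_0", 0.9994438290596008),
    ("arg_15_1", "kp_15_2", 0),
]

predictions = build_predictions(scored_pairs)
serialized = json.dumps(predictions, indent=2)
print(serialized)
```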
Text Generation from Structured Data
Who is the Italian prime minister?
[BlenderBot]: It is Mussolini. He was the leader of the fascist party during WWII.
Is he still alive?
[BlenderBot]: Yes, he is still alive. He has been in politics since 1989. He also has a son.
Text Generation from Structured Data
Example (WebNLG)
D= Aarhus | leaderName | Jacob Bundsgaard
Linearized D= <subject> Aarhus <predicate> leader name <object> Jacob Bundsgaard
T= The leader of Aarhus is Jacob Bundsgaard.
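The linearization step in the example can be sketched as follows. The function name `linearize` and the camel-case splitting rule are assumptions chosen to reproduce the example; actual WebNLG preprocessing pipelines may differ.

```python
import re

def linearize(triple: str) -> str:
    """Turn 'subject | predicate | object' into a tagged token sequence."""
    subject, predicate, obj = [part.strip() for part in triple.split("|")]
    # Split camelCase predicates into lowercase words: leaderName -> leader name.
    predicate = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", predicate).lower()
    return f"<subject> {subject} <predicate> {predicate} <object> {obj}"

linearized = linearize("Aarhus | leaderName | Jacob Bundsgaard")
print(linearized)
```

The linearized string is what a sequence-to-sequence model would receive as input, with the sentence T as the generation target.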
Prompt Tuning
BERTology
Chatbot
convai.io/
IWPT Shared Task
CoNLL 2018: Deep Learning Tokenizer
Evalita 2016-2023
Evalita 2023
Question Answering Tasks
https://towardsdatascience.com/nlp-building-a-question-answering-model-ed0529a68c54
http://movieqa.cs.toronto.edu/home/
http://2018.nliwod.org/challenge
Chatbots
Neural Machine Translation
Deep Learning for Sentiment Analysis
Medical texts
Negation/Speculation Scope
Relation Extraction
Fake News Detection
Dataset Collections