Oscar Sainz

PhD Student @ University of the Basque Country
Member of Ixa group & HiTZ center

Email: oscar.sainz@ehu.eus

About me

Greetings, I'm Oscar!

I received my Bachelor's degree in Computer Science in 2019 and my Master's degree in Natural Language Analysis and Processing in 2020, both from the University of the Basque Country (UPV/EHU). I am currently pursuing a Ph.D. in Natural Language Processing at UPV/EHU. I have been a member of the IXA research group since 2018, working on NLP research.

My research focuses on two areas: low-resource information extraction and data contamination. I'm passionate about finding creative solutions to extract meaningful insights from limited data resources, as seen in my work on zero- and few-shot information extraction. I also work on data contamination, aiming to improve data quality and reliability in Natural Language Processing applications.

Recent publications

Showing the last 10 publications. For more, visit my Google Scholar or Semantic Scholar page.

Students

Mikel Zubillaga (2022-2023): BSc student at the University of the Basque Country, co-advised with Oier Lopez de Lacalle.

Other activities

Co-organizer of The First Data Contamination Workshop (CONDA).
Four-month visit (September-December) at LMU Munich with Prof. Dr. Hinrich Schütze.
Active reviewer for NLP conferences such as *ACL (ARR), EMNLP, and LREC since 2021.

Grants and Awards

Best Reviewer Award at EMNLP 2023 in the Information Extraction track
Predoc-berri PhD grant by the Basque Government
IKASIKER collaboration grant by the Basque Government

Career

2020-present

Ph.D. in Natural Language Processing

University of the Basque Country (UPV/EHU)
HiTZ Center for Language Technologies - Ixa group

2019-2020

M.S. in Language Analysis and Processing

University of the Basque Country (UPV/EHU)

Grade: 9.26 / 10

2018-2020

Research Internship

Ixa research group
IKASIKER collaboration grant

2015-2019

B.S. in Computer Science

University of the Basque Country (UPV/EHU)

Grade: 8.06 / 10

Slides

GWC 2023

NAACL 2022 (Slides)

EMNLP 2021

Posters

ICLR 2024

EMNLP 2023

SemEval 2023

NAACL 2022 (Demo)

Code

hitz-zentroa/GoLLIE

Guideline-following Large Language Model for Information Extraction

osainz59/Ask2Transformers

A Framework for Textual Entailment based Zero Shot text classification

hitz-zentroa/lm-contamination

The LM Contamination Index is a manually created database of contamination evidence for LMs.

osainz59/t5-encoder

An extension of the Transformers library that adds a T5ForSequenceClassification class.