Data manager

Job description

Data is of paramount importance because it accelerates the development of artificial intelligence technology. For example, ImageNet in computer vision and SQuAD in natural language processing have had a huge impact on the artificial intelligence community and received a great deal of attention. High-quality data also yields high-quality models and enables meaningful evaluation. For these reasons, data construction is covered in the Resources tracks of prestigious conferences such as *ACL and EMNLP.

As a Data Management Intern at Upstage, you will join the AI tech team and work on a variety of tasks needed to build data for NLP or CV. You will design the conditions the data must satisfy to train and evaluate models effectively, as well as methods for conducting data annotation efficiently. This is an essential capability for AI practitioners and will benefit both your career and the community. Because you will build multiple types of data, you will also gain experience with a variety of machine learning tasks.


  • Design, gather, and manage large-scale data for machine learning training and evaluation
  • Create and/or review data annotation guidelines
  • Serve as a key liaison with external data enrichment vendors for ongoing data enhancement and augmentation efforts

Specialty areas:

  • Natural language processing
  • Computer vision

Recruitment details:

  • Position type: Full-time, Internship
  • Recruitment process: Resume screening → Online interview → Offer

Job requirements

Required qualifications:

  • Understanding of basic machine learning tasks (e.g., sentiment analysis)
  • Experience in data acquisition
  • Experience with image- or text-related pre-/post-processing libraries (e.g., OpenCV)
  • Effective negotiation and influencing skills

Preferred qualifications:

  • Experience in building models for any machine learning task
  • Experience in data engineering
  • Experience in writing data annotation guidelines for machine learning
  • Experience in communicating data annotation guidelines to annotators
  • Understanding of evaluation metrics for machine learning tasks