QED (https://qed.ai) is a tech company focused on public health and food security in Sub-Saharan Africa. We build the digital infrastructure and AI used at the intersection of aid and scientific inquiry, including in-situ nutrient analysis of crops, soils, and foods, and surveillance of HIV and malaria at national scale in multiple African countries. Our funding comes from philanthropic and governmental organizations, including the Gates Foundation, the Polish Ministry of Agriculture, EU funding programmes, and the Global Fund.
We are looking for a Data Scientist to join our team in Warsaw, Poland, to work on ScanSpectrum (https://scanspectrum.qed.ai), a low-cost, handheld spectrophotometry kit used to assess quality parameters of food and agricultural products.
Build and improve regression models for spectroscopic data (e.g., PLSR, SVR, Extra Trees).
Plan sampling strategies and real-world experimental design with a strong statistical mindset (representativeness, power, coverage across varieties/regions/seasons/batches).
Design robust training/evaluation pipelines: preprocessing, feature engineering, hyperparameter tuning, and proper cross-validation.
Apply chemometrics best practices to spectral data (baseline/scatter effects, derivatives, outliers, drift).
Diagnose model failures and data issues (instrument effects, label noise, dataset shift) and propose fixes.
Construct dashboards/reports to present and visualize data analytics.
Collaborate with engineering/product to package and deploy models (model export, versioning, monitoring metrics).
Help communicate effectively with the scientific institutes, companies, and researchers that QED collaborates with or wishes to collaborate with.
Collaborate with governmental, medical, agronomic, and computer science teams to write research papers and impact reports.
Proficiency with core ideas in statistics.
Formal academic studies in statistics, data science, or software engineering, coupled with some practical experience with analyzing real-world datasets.
Ability to use computer programming to wrangle, inspect, and analyze statistical data.
Proficiency with git and Python-based programming environments, preferably with CI/CD.
Practical proficiency in analyzing substantial real-world datasets with traditional statistical techniques, such as regression, hypothesis testing, survey design, time series, and RCTs, and also with modern machine learning methods, such as decision trees, boosting, and neural networks.
Tenacity and curiosity to acquire domain expertise as needed, enabling rigorous contextualization of statistical challenges within established scientific frameworks (e.g., biology, chemistry, agronomy).
Working proficiency (≥C1) in speaking and reading English, and the ability to type English at a speed of at least 45 words per minute.
Logical reasoning and ability to express oneself clearly, both orally and in writing.
Willingness to work with people from other cultures, and genuine interest in doing so. Emotional resilience and social intelligence to communicate and work with collaborators from around the world, including Europe, Africa, Asia, and the USA.
Don’t be afraid of getting your hands dirty.
… and you have to care about the work that you do!
Below are additional skills that are a bonus, but are not required:
Prior experience with spectroscopy (Vis/NIR/MIR) and chemometrics in practice.
Understanding of spectroscopy physics (e.g., overtones/combination bands, scattering, path length effects).
Experience with calibration transfer, drift monitoring, or multi-instrument datasets.
Domain familiarity: food chemistry, agriculture, grain quality, soil/plant analysis, lab reference methods.
Domain knowledge of, and/or interest in, the Sustainable Development Goals, particularly in agriculture, climate change, public health, and/or assisting developing countries.