David Wallin

Work Experience

Data Scientist

2018 - Present

Nordnet Bank AB

As the Data Scientist in the Analytics team, I deliver insights to various stakeholders within the organisation, as well as tools that help our customers become better investors. This work takes different forms, such as building recommender systems using matrix factorisation models, evaluating campaigns using Causal Impact modelling, and performing significance testing of marketing campaigns.
I have implemented regression and classification models for different tasks, such as budget allocation. These models are often developed using frameworks such as scikit-learn, CatBoost, XGBoost and Keras.
During a platform upgrade, I helped evaluate the expected performance of the system using prophecy forecasting.
I developed a tool that converts files from the closed QlikView data format (QVD) to the open Parquet format, to ease migration tasks.
I have been part of the Non-Maturity Deposit risk modeling team.
During this time, I have also worked with the Google Cloud Platform, where I have built ETL pipelines using, among other components, Cloud Functions, BigQuery, AI Notebooks (Jupyter), Cloud Composer (Apache Airflow) and Cloud Dataflow (Apache Beam). Increasingly, we have used Terraform to keep our growing cloud infrastructure in committable code form.

Data Engineer (Consultant)

2018 - 2018

H&M

As part of the Exploratory Analytics team, I worked on building a platform that gave Data Scientists and Analysts access to data from various sources on the Azure cloud platform.
The service was built using various components, such as Azure Functions, Azure Data Factory, Azure Data Lake Analytics and Azure SQL Data Warehouse. Development consisted mostly of SQL (T-SQL, U-SQL with C#/LINQ and Databricks/Spark SQL), but also some programming in NodeJS.

Data Scientist (Consultant)

2017 - 2018

Ivbar AB

The work consisted of building prediction models relating to patient admission and discharge data at two acute medical units in Sweden. Using Python and various machine learning frameworks, such as scikit-learn, XGBoost, CatBoost and Keras, we built models using different algorithms, from linear regression and k-nearest neighbour to gradient boosting and recurrent neural networks.

Lead in Big Data Analysis and Visualisation (Consultant)

2017 - 2017

Tobii AB

Part of a team working with cloud solutions in the adtech field, developing a new metric ("Seen") within digital marketing funnels.
Responsible for setting up a scalable solution for analysis of user behavioural data for reporting and visualisation, with GDPR compliance in mind.
Development in AWS, using SQL, Python and NodeJS, and AWS services: Kinesis, Athena and Lambda. Development followed a "GitOps/infrastructure-as-code" methodology.

Consultant in Data Science

2017 - 2018

Knowit Decision Helikopter AB

(Consulting highlights at various companies in Sweden are described separately; see above.)
Work also included shorter consultations involving design and implementation of ETL pipelines on the Azure cloud platform. One example was a real-time pipeline visualising patient blood pressure data using Azure Functions, Azure Stream Analytics and Power BI.

Data Scientist / Information Analyst

2013 - 2017

Seamless Payments AB

I was responsible for turning our data into one of our most valued assets while respecting the integrity of our customers. The work involved mining for patterns and correlations in transactional data, including basket analysis on receipt data using the Apriori algorithm, and the use of bandit algorithms as a more dynamic and adaptive alternative to A/B testing within the SEQR mobile app. The work also included the development of an analytics platform, including ETL pipelines.

Software Developer

2012 - 2013

Ericsson AB (DUCI)

As part of a cross-functional team developing the CSCF component (a core component of Ericsson's LTE offering), I worked on adapting the build system and the testing frameworks (functional and system) to a Continuous Integration (Jenkins) environment.

Research Assistant

2002 - 2008

Biocomputing and Developmental Systems Group, University of Limerick

The overall research goal was to enhance the performance and usability of the automatic program discovery system called Grammatical Evolution.

Evaluation of various target language backends.
Tremendous speedup: from over an hour before to single-digit minutes after.
Code was mainly written in C/C++.
Evaluation using a large Beowulf cluster.
Analysis of large data sets.

Assistant Lecturer & Teaching Assistant

2002 - 2007

Computer Science and Information Systems Department, University of Limerick

Teaching students in various courses in the field of computer science and information technology, including the introductory course. The tasks involved lecturing, tutoring, construction of exam questions and correcting exams.

Software Developer

2000 - 2002

Virtual Genetics Laboratory AB

Designed and implemented a customer-praised GUI for a classification, regression and data mining application (Virtual Predict). The GUI included preprocessing features, such as feature selection and principal component analysis, as well as a visualisation framework.

Roles: Software Architect, Developer, Tester.
Designed and implemented a client/server protocol.
Implementation mostly in Java and Swing.
Agile development model.
First to migrate to Linux and OS X.
Introduced open-source tools to the team (CVS, Doxygen, NetBeans).

Publications

Candidate Oversampling Prefers Two to Tango

GECCO '11: Proceedings of the 13th annual conference companion on Genetic and Evolutionary Computation

David Wallin, Conor Ryan and R. Muhammad Atif Azad.

Evaluation of Population Partitioning Schemes in Bayesian Classifier EDAs

Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation

David Wallin and Conor Ryan.

Using Over-Sampling in a Bayesian Classifier EDA to Solve Deceptive and Hierarchical Problems

2009 IEEE Congress on Evolutionary Computation

David Wallin and Conor Ryan.

Diversity in Discrete EDAs on Real-Valued and Dynamic Problems

Soft Computing, Springer Verlag.

David Wallin and Conor Ryan.

Maintaining Diversity in EDAs for Real-Valued Optimisation Problems

Proceedings of Frontiers in the Convergence of Bioscience and Information Technologies 2007 (FBIT2007)

David Wallin and Conor Ryan. Nominated for Best Paper award.

On the Diversity of Diversity

Proceedings of the 2007 Congress on Evolutionary Computation (CEC 2007)

David Wallin and Conor Ryan.

Effect of Endosymbiosis in the Symbiogenetic Coevolutionary Algorithm

Proceedings of the 7th International Conference on Artificial Evolution (EA'05).

David Wallin, Conor Ryan and R. Muhammad Atif Azad.

Symbiogenetic Coevolution

Proceedings of the 2005 IEEE International Conference on Evolutionary Computation

David Wallin, Conor Ryan and R. Muhammad Atif Azad.

Non-stationary Function Optimization Using Polygenic Inheritance

Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2003), Part II

Conor Ryan, J. J. Collins and David Wallin.

Adaptation of Hyper Objects for Classification

Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001), Graduate Student Workshop Program

David Wallin.

Education

PhDc (ABD) in Computer Science

University of Limerick

Master in Computer Science

Uppsala University

Senior Technical High School in

Åsö Gymnasium + Skärholmens Gymnasium

Languages

Skills

Machine learning

Programming

Databases

Tools
