EXPERIENCED
MACHINE LEARNING
AND DEEP LEARNING
DATA SCIENTIST

Transform raw data into intelligent data

ABOUT US

EMMALUK CAN:

Help businesses turn raw data into intelligent data.

Help you explore, analyse and visualise your data.

Offer extensive international experience either leading a team or individually undertaking software development, testing and implementation.

Please review my employment history and the milestones from this period as outlined within the attached Curriculum Vitae.

OUR SERVICES

PERSONALISED ANALYTICS

At our company, we take the raw data from our clients and turn it into a meaningful result, working closely with our users throughout our process to ensure that the analysis is relevant.

We are committed to providing clean, in-depth datasets.

RIGOROUS TESTING

Once we have finished our initial analysis of our client's data, we perform multiple quality checks.

These tests are included in the price of the analysis package.

After testing, we deliver the results to our clients.

We provide a "typo-free" certification along with our clean data.

QUALITY GUARANTEED

The world of science and technology can be hard to keep up with. That's why our goal is to provide our clients with high-quality visual analytics. No matter the discipline or type of data, we pride ourselves on providing professional results. We guarantee you will be satisfied with our work.

  • Information is the oil of the 21st century, and analytics is the combustion engine.
  • Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.

SEE MY PROFESSIONAL CV AND DATA SCIENCE PORTFOLIO FOR MORE INFORMATION

PORTFOLIO & PROJECTS

A SELECTION OF MY DATA SCIENCE WORK

1.

DEEP LEARNING WITH TENSORFLOW: LONG SHORT-TERM MEMORY (LSTM) NEURAL NETWORK FOR STOCK MARKET PREDICTIONS WITH PYTHON

Deep learning is a subset of machine learning, within artificial intelligence (AI), that uses multi-layer neural networks to learn patterns directly from data.
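An LSTM for price prediction is usually trained on fixed-length windows of past prices, each paired with the value that follows. The sketch below shows that windowing step only. The function name `make_windows`, the window length of 3 and the closing prices are illustrative assumptions, not details taken from the project.

```python
def make_windows(prices, window):
    """Split a price series into (input window, next value) training pairs."""
    if window < 1 or window >= len(prices):
        raise ValueError("window must be between 1 and len(prices) - 1")
    inputs, targets = [], []
    for start in range(len(prices) - window):
        inputs.append(prices[start:start + window])  # the past `window` prices
        targets.append(prices[start + window])       # the price to predict
    return inputs, targets


# Made-up closing prices, used only to show the shape of the output.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
X, y = make_windows(closes, window=3)
```

Each row of `X` would become one input sequence for the network, and the matching entry of `y` its training target.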

PS: In deep learning, we use a loss function to quantify how badly the model fits the data. An underfit model has high training error and high testing error, while an overfit model has extremely low training error but high testing error.
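The underfit/overfit rule above can be written down as a small check. This is a minimal sketch: mean squared error stands in for the loss function, and the cut-off `high=1.0` is a hypothetical threshold that would need tuning for real data.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def diagnose_fit(train_error, test_error, high=1.0):
    """Label a model from its errors; `high` is a hypothetical cut-off."""
    if train_error > high and test_error > high:
        return "underfit"        # poor on both training and test data
    if train_error <= high and test_error > high:
        return "overfit"         # good on training data, poor on test data
    return "reasonable fit"      # test error is low


# Made-up values: near-perfect on training data, poor on unseen data.
train_mse = mean_squared_error([1.0, 2.0, 3.0], [1.1, 1.9, 3.0])
test_mse = mean_squared_error([4.0, 5.0], [2.0, 7.0])
label = diagnose_fit(train_mse, test_mse)
```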

FIGURE 1

Shows Predicted Stock Prices (red) and Actual Stock Prices (blue)

FIGURE 2

Figure 2 shows a model that is well fitted ("just right"): the training error (red line) is slightly lower than the test error (blue line).

FIGURE 3

Plot of 'Close' for Global Equity Income sector price history

2.

ANALYSING HOUSE PRICES AND CRIME IN R

Unsupervised learning: Principal Component Analysis (PCA):

  • The aim was to investigate how current house prices were affected by recent crime levels in London boroughs.
  • Performed data cleaning, transformation and manipulation, then conducted Principal Component Analysis (PCA): computing the principal components and using them to better understand the data. PCA is considered an unsupervised machine learning method because it involves only a set of feature variables and no associated response variable. It also serves as a useful tool for exploratory analysis and data visualisation.

IN WANDSWORTH

House 51: £538,999.3

House 104: £ 973,938.4

IN HOUNSLOW

House 71: £689,162.6

House 49: £396,623.0

FOR PCA

The data points for Houses 104 and 71 lie near the House Price point: these houses are more expensive.

The data points for Houses 51 and 49 lie near the Crime point: these houses are less expensive. This is an indication that crime has some effect on house prices.

Left: Figure 4: 3D scatterplot using three principal components

Right: Figure 5: 2D scatterplot using two principal components
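For two variables, the PCA step in this project can be sketched without any libraries, because a 2x2 covariance matrix has a closed-form eigen-decomposition. The project itself was done in R. The Python below is only an illustration, and the `crime` and `price` values are made up (perfectly anti-correlated so the result is easy to check by hand).

```python
import math


def first_component(xs, ys):
    """Return (unit direction, share of variance) of the first principal component."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Entries of the sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    c = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    if a + c == 0:
        raise ValueError("data has no variance")
    # Largest eigenvalue of a symmetric 2x2 matrix.
    top = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b == 0:
        direction = (1.0, 0.0) if a >= c else (0.0, 1.0)
    else:
        vx, vy = top - c, b  # eigenvector belonging to `top`
        norm = math.hypot(vx, vy)
        direction = (vx / norm, vy / norm)
    return direction, top / (a + c)


# Made-up, already scaled values for five hypothetical houses.
crime = [1.0, 2.0, 3.0, 4.0, 5.0]
price = [5.0, 4.0, 3.0, 2.0, 1.0]
direction, share = first_component(crime, price)
```

Here the two entries of `direction` have opposite signs, meaning the first component contrasts crime with price, and `share` is the fraction of total variance that component explains.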

3.

AN EXAMPLE FROM WHEN I WORKED AT MERCEDES-BENZ UK

The first ‘RV Central’ in the UK: a system that calculates the residual value of used cars. It has since been implemented by Mercedes-Benz throughout all their UK dealerships.

I also applied data science skills and new technology to improve business processes and efficiency and to reduce costs. This helped the company automate processes that had previously been paper-based, increasing efficiency and reducing environmental impact.

4.

EXAMPLES OF A/B TESTING AND MULTIVARIATE TESTING (MVT) FROM WHEN I WORKED AT AXA

The two examples below depict different versions of the same webpage, which were used to provide insight to drive future strategies and identify business opportunities and problems.
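One common way to compare two versions of a page is a two-proportion z-test on their conversion rates. The source does not say which test was used at AXA, so the sketch below is an assumption, and the visitor and conversion counts are made up.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: version A converts 100 of 1000, version B 150 of 1000.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

A small `p_value` suggests the difference between the two versions is unlikely to be chance alone. Multivariate tests with more than two versions need a different analysis.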

5.

EXAMPLES OF DATA VISUALISATION FROM WHEN I WORKED ON THE ‘FINGERTIPS’ WEBSITE FOR PUBLIC HEALTH ENGLAND (GOVERNMENT AGENCY)

On the left are boxplots depicting the percentage of children in low income families in the East Midlands between the years 2006 and 2014; on the right is a bar graph with negative stack depicting the proportion of males and females of different age groups in the East Midlands region.

6.


INDUSTRY PROJECT:
HOW ROBOTS ARE MAKING FARMING PROFITABLE: WEATHER DATA ANALYTICS USING HADOOP

Led the application's big data flow: ingesting data from upstream sources into HDFS, processing and analysing the data in HDFS, and visualising the results in R and JavaScript.

7.

EXPLORE RELATIONSHIPS BETWEEN WEATHER CONDITIONS AND ENERGY CONSUMPTION WITH R

  • The goal was to draw a graph that shows how the samples are related (or not related) to each other.
  • The question was: what is the relationship between weather conditions and energy consumption in London?

Principal Component Analysis (PCA) Results with 2D & 3D graphs:

The project has over 11 features. PCA transforms the original variables into a new set of variables, each of which is a linear combination of the original ones. PCA is deterministic: the same data always yields the same components (up to sign), with no dependence on random initialisation. Keeping only the first two or three components makes the data plottable on a 2D or 3D graph.

PCA is a popular technique for projecting a dataset onto a lower-dimensional subspace for visualisation and further exploration. The principal components come from the eigen-pairs (eigenvectors and eigenvalues) of the data's covariance matrix: the eigenvectors describe the directions in the original feature space with the greatest variance. Houses 59, 51 and 50: PCA is sensitive to outliers, which may distort the eigendirections.
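The sensitivity to outliers can be seen on a small made-up dataset. The sketch below approximates the first principal direction by power iteration on the covariance matrix, which works for any number of features. The data points, the start vector and the iteration count are illustrative assumptions, and the project itself used R.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix; each row is one observation."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_direction(rows, iterations=200):
    """Approximate the first principal direction by power iteration."""
    cov = covariance_matrix(rows)
    d = len(cov)
    vec = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            raise ValueError("no variance along the current vector")
        vec = [v / norm for v in nxt]
    return vec


# Made-up points spread along the first axis, then one extreme outlier.
clean = [(-2.0, 0.1), (-1.0, -0.1), (0.0, 0.0), (1.0, 0.1), (2.0, -0.1)]
with_outlier = clean + [(0.0, 20.0)]

clean_dir = leading_direction(clean)           # dominated by the first axis
outlier_dir = leading_direction(with_outlier)  # pulled towards the second axis
```

A single extreme point is enough to swing the leading direction from one axis to the other, which is why outliers are worth inspecting before interpreting a PCA plot. Running the function twice on the same data gives the same vector, which is what "deterministic" means here.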

NEURAL NETWORKS

An artificial neural network was used. The network has an input layer with ten nodes, one output layer, and a number of hidden layers. The nodes in each layer are called neurons, and they perform non-linear calculations.
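A forward pass through a small network of this kind can be sketched as follows. The layout is an assumption (ten input values, a single hidden layer of two tanh neurons, one linear output), and all weights are hypothetical fixed numbers instead of trained values.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: weighted sum plus bias, then activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Run the inputs through one hidden layer and a linear output neuron."""
    hidden = dense(inputs, hidden_w, hidden_b, math.tanh)  # non-linear step
    return dense(hidden, out_w, out_b, lambda v: v)[0]     # linear output


features = [0.1] * 10                  # ten made-up input values
hidden_w = [[0.5] * 10, [-0.5] * 10]   # hypothetical weights, two hidden neurons
hidden_b = [0.0, 0.0]
out_w = [[1.0, 1.0]]                   # hypothetical output weights
out_b = [0.25]

prediction = forward(features, hidden_w, hidden_b, out_w, out_b)
```

With these symmetric weights the two hidden activations cancel, so the output equals the output bias. Training would adjust the weights to reduce the loss.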

8.

DATA VISUALISATION IN MICROSOFT POWER BI & TABLEAU

  • The aim was to create an executive dashboard to track and report on business metrics and KPIs.
  • The dashboard included key performance indicators, top ten products, top-performing cities, customer reviews and sales by month.

PLEASE GET IN TOUCH IF YOU HAVE ANY QUESTIONS ABOUT MY WORK

© 2022 EMMALUK