
Valery Khamenya

Partially available

Last update: 06.09.2022

Sr Data-Scientist, ML/DS Big Data (Spark, Data-mining, Text-mining, Web-mining, Python, R, MongoDB)

Degree: Diplom in mathematics (Dipl.-Mathematiker)
Languages: German (business fluent) | English (business fluent) | Russian (native)

Attachments

Valery.Khamenya-SrDataScientist-AI-ML-DL-2021-07.docx

Skills

I am a data scientist (MS math) from Munich with strong IT skills and experience in:
  • Data mining / text mining / web mining (see dooblet.com or check this screencast: https://www.youtube.com/watch?v=mW_D51kGN2o)
  • Attribution modeling (GAM, ARIMA, ARMA, etc.)
  • Data analysis (R, Python), modeling, forecasting
  • Web applications for data analysis
  • Exploratory data analysis
  • Biostatistics: proteomics, epigenomics (other *omics are also very welcome!)

You’ll catch my extra attention if your project targets one or more of:
  • Machine learning, deep learning (esp. LSTM)
  • Large-scale parallelism (“deep parallelism”)
  • Analysis/modeling of huge amounts of data (Big Data)
  • Bioinformatics, Biostatistics, Pharma, Psychology, Finance
  • Go (Golang), Scala, TensorFlow/Theano

Project history

02/2019 - present
Chief Analytics Officer (CAO)
Ludaciti GmbH (Internet and Information Technology)

  • Mentoring for Advanced Analytics
  • Multivariate Analysis, Correspondence Analysis, Cause and Effect Analysis, Factor Analysis, ICA, PCA, etc.

Tools: R / RStudio, Python / PyTorch


04/2019 - 09/2019
Spark Architect / Sr. Big Data Engineer (Spark)
Münchener Hypothekenbank eG (Banking and Financial Services)

• Building a data lake from scratch
    • Automated Spark batch processing
    • Setting up the full GitLab CI cycle and GitLab Flow
    • Spark-based data depersonalization and historization
    • Defining the Software Developer Guide for the team

Tools: Spark, Hive, Hadoop, Python, PySpark, GitLab, Docker, Docker Hub, Apache Zeppelin, Hortonworks, Nexus, git, Linux toolset

08/2017 - 01/2019
Sr. Data Scientist (Big Data)
Telefónica Germany GmbH & Co. OHG (Telecommunications)

Highlight: I am the author of the geo-localization technology used by Telefónica

    • Location Intelligence derived from Big Data
    • Geo-localization of subscribers based on anonymized low-level event data produced within the mobile network
    • Big Data analysis of network event data using AWS EMR and Spark

Tools: Scala, Spark, Python, R, Zeppelin, Hive SQL, Hadoop, S3, AngularJS, AWS EMR, EC2, Docker, AWS Linux, OpenStreetMap (OSM)

05/2017 - 06/2017
Roll-out manager for mission-critical innovation presentations
Deutsche Telekom AG (Telecommunications)

• Edge computing, “5G” and generic AWS cloud computing
• Advising on Python-based microservice networking (TCP/UDP/OSC/HTTP)
• Advising on roll-out, working together with hardware engineers
• Advising on DevOps questions

Tools: Python, C++, Docker

03/2017 - 04/2017
Attribution modeling consultant
ProSieben (Media and Publishing)

• Consulting on attribution modeling using R
• Advising on generalized additive models (GAM) for time series
• Advising on autoregressive moving average models (ARMA / ARIMA)
• Advising on algorithms for optimal budget distribution among assets (algorithms for investment strategies maximizing ROI)

Tools: R

05/2016 - 12/2016
Team-Lead / Delivery Manager during Go-Live Phase
Vorwerk (Consumer Goods and Retail)

• Main: team leadership and delivery management for two SOA components of the Thermomix® Cloud Platform during the Cook-Key® Go-Live Phase
• Reporting to Vorwerk top management of the Thermomix® group
• Communication with the software vendors of adjacent AWS microservices
• Hands-on incident management in backend components
• Governance processes, change & release management
• Automation of KPI reporting
• Integration of both microservices into the Thermomix® Cloud Platform
• Hands-on collaboration with DevOps during deployment and configuration
• Automated analysis of large volumes of log records

Tools: Polarion, Kibana, Docker, C++, git, nginx, MySQL and others

03/2016 - 03/2016
Workshop Coach for Semantic Search and Web-Mining
AbbVie Inc (Pharma and Medical Technology)

• Coaching AbbVie employees of the research team
• Topic: “Semantic Search and Web-Mining”

Tools: extensive personal experience and PowerPoint!

07/2014 - 01/2016
R&D Consultant for automated high-precision navigation
BMW Group, Forschungs- und Innovationszentrum (BMW FIZ) (Automotive and Vehicle Manufacturing)

• Development of high-precision road maps for advanced driver assistance systems and autonomous driving
• Solution implementation in software (Scrum development mode)

Tools: C++, PostgreSQL, OpenCV, a wide range of Linux tools

05/2014 - 06/2014
Fraud Analyst / DB Transaction Analyst
xWare42 GmbH (Banking and Financial Services)

• Developing DB triggers for detecting fraudulent transactions
• MySQL reporting for monitoring fraudulent transactions

Tools: MySQL/MariaDB, a wide range of Linux tools

10/2013 - 03/2014
Consultant in Audio Recognition and Python Coach
wywy GmbH (Media and Publishing)

• Consulting in Audio Pattern Recognition
• Coaching regarding Python, TDD, SCRUM
• Deployment, Configuration, Provisioning (Docker, Puppet) 
• Microservices Architecture / Service-Oriented Architecture (SOA)

Tools: Python, PyPy, NumPy, Docker, Puppet, MATLAB, a wide range of Linux tools

05/2013 - 08/2013
R&D Consultant for highly automated car driving
BMW Group, Forschungs- und Innovationszentrum (BMW FIZ) (Automotive and Vehicle Manufacturing)

• Assessing the capabilities of sensors for advanced driver assistance systems and autonomous driving
• Automated assessment and evaluation of large volumes of sensor data
• Automated report generation suitable for technical and management decisions

Tools: MATLAB, LaTeX, XML, XSLT, C++, OpenCV, a wide range of Linux tools

09/2012 - 04/2013
R&D Consultant for semantic search
Sopdu GmbH (Internet and Information Technology)

• Applying the semantic search engine to find candidates rather than job ads
• Optimization of search output
• Enhancement of context-based search

Tools: Python, MongoDB, Lucene, Java, a wide range of Linux tools

10/2011 - 07/2012
Sr DevOps
Payback GmbH (Marketing, PR and Design)

• Deployment automation in a distributed SOA environment
• Automation of individual node configuration

Tools: Python, Oracle WebLogic 12, bash, a wide range of Linux tools

04/2009 - 09/2011
R&D Consultant for semantic search
Sopdu GmbH (Internet and Information Technology)

• Design of a semantic search engine for job search applications
• Extracting semantics from corpora using machine learning
• Tuning for user preferences using machine learning
• Implementation of a semantic search engine prototype
• Full responsibility for the data-mining, text-mining and web-mining
• Data-warehouse implementation
• Deployment management
• See also: Screencast Video 

Tools: Python, MongoDB, R, Lucene, Java, GNU make, bash and a wide range of Linux tools

03/2009 - 05/2011
Business Development Manager, EMC Documentum SOA
Reksoft GmbH (Internet and Information Technology)

• Lead acquisition and support
• Representing the outsourcing company Reksoft in Germany

07/2007 - 03/2009
On-site project coordinator at Fujitsu-Siemens (as a permanent employee)
Reksoft GmbH (Internet and Information Technology)

• Main: coordination of 30+ employees developing software in about 20 projects for Fujitsu‑Siemens 
• Representing the outsourcing company Reksoft in Munich
• Delivery management, conflict management 
• Budget & capacity planning

Tools: C# / .NET, Microsoft SharePoint

01/2007 - 06/2007
Self-employed start-up developer
(self-employed) (Internet and Information Technology)

• Main: creating a search service for finding alternatives to almost any well-known search term
• Result: a search service based on web mining, dooblet.com

Tools: R, Python, Django, a wide range of Linux tools

10/2005 - 01/2007
Bioinformatician / Biostatistician / Sr Software Developer (as a permanent employee)
Epigenomics AG (Pharma and Medical Technology)

• Main: finding epigenetic biomarkers on DNA
• Moderating regular software developer seminars

Tools: R, Sweave, Bioconductor, LaTeX, Affymetrix, a wide range of Linux tools

06/2002 - 09/2005
Bioinformatician / Biostatistician / Sr Software Developer (perm. employee)
BioVision AG (Pharma and Medical Technology)

• Main: creating a parallelized software service for finding peptide biomarkers in large volumes of data obtained via HPLC + mass spectrometry
• Data mining in large volumes of molecular-level measurement data
• Data storytelling in exploratory protein analysis
• Building a computer cluster of 50 nodes to speed up task solving
• Developing software for the cluster

Tools: C++, R, OpenSSI, a wide range of Linux tools

09/2001 - 05/2002
Sr. Software Developer (as a permanent employee)
Cognitec AG (Internet and Information Technology)

• Main: speeding up the in-house face recognition software using Intel’s performance libraries (an approach I proposed to the team during my interview)

Tools: C++, Intel MKL/IPP/IPL libraries, a wide range of Linux tools

08/2000 - 09/2001
Head of Department, R&D Machine Learning & Neural Networks (as a permanent employee)
R&D Center Modul (Internet and Information Technology)

• Main: supervising 3 R&D teams working on their individual projects based on machine learning and neural networks

• Projects successfully finished under my coordination:
  ◦ Automatic person recognition based on eye pupil images
  ◦ Automatic person recognition based on an arbitrary phrase
  ◦ Automatic car surveillance system featuring speed measurements

02/1997 - 08/2000
Team-Lead, R&D Machine Learning & Neural Networks
R&D Center “NTZ Modul” (Industry and Mechanical Engineering)

• Main: creating a neural network paradigm with a transparent model for feasible and controllable training and recognition
• Leading a team of 4 employees
• Showed that SOM neural networks can be enhanced to approximate complex multidimensional data distributions, similarly to Gaussian Mixture Models
• Completed two projects based on the enhanced SOM networks for Jena‑Optronik and DaimlerChrysler Aerospace

Tools: C++, a wide range of Linux tools

Willingness to travel

Available in Germany, Austria and Switzerland
Munich, Switzerland or remote

Additional information

http://dooblet.com
 

YouTube video

Semantic search engine created by me and applied for the job search
